The Complexity of the Consistency and N-representability
Problems for Quantum States

Yi-Kai Liu 
Computer Science and Engineering 
University of California, San Diego 
y91iu@cs.ucsd.edu

Aug. 22, 2007 
(slightly revised version, Dec. 17, 2007) 



Abstract 



QMA (Quantum Merlin-Arthur) is the quantum analogue of the class NP. There are
a few QMA-complete problems, most of which are variants of the "Local Hamiltonian" 
problem introduced by Kitaev. In this dissertation we show some new QMA-complete 
problems which are very different from those known previously, and have applications 
in quantum chemistry. 

The first one is "Consistency of Local Density Matrices": given a collection of density
matrices describing different subsets of an n-qubit system (where each subset has
constant size), decide whether these are consistent with some global state of all n qubits.
This problem was first suggested by Aharonov. We show that it is QMA-complete, via 
an oracle reduction from Local Hamiltonian. Our reduction is based on algorithms for 
convex optimization with a membership oracle, due to Yudin and Nemirovskii. 

Next we show that two problems from quantum chemistry, "Fermionic Local Hamiltonian"
and "N-representability," are QMA-complete. These problems involve systems
of fermions, rather than qubits; they arise in calculating the ground state energies of
molecular systems. N-representability is particularly interesting, as it is a key component
in recently developed numerical methods using the contracted Schrödinger equation.
Although these problems have been studied since the 1960's, it is only recently that the 
theory of quantum computation has provided the right tools to properly characterize 
their complexity. 

Finally, we study some special cases of the Consistency problem, pertaining to 1- 
dimensional and "stoquastic" systems. We also give an alternative proof of a result due 
to Jaynes: whenever local density matrices are consistent, they are consistent with a 
Gibbs state. 

Chapter 1

Introduction 



1.1 Overview 

Beginning in the 1980's, the field of quantum mechanics was reinvigorated by a new 
idea: that quantum mechanics has important consequences for machines that store and 
manipulate information. In particular, it appeared that quantum computers might be 
more powerful than classical computers. This opened up a new direction in computer 
science, and led to discoveries such as Shor's algorithm for factoring and discrete
logarithms [76], Grover's algorithm for black-box search [39], and the first schemes for
fault-tolerant quantum computation [75]. Since then, the field of quantum computation 
has developed rapidly, and there is considerable interest in building practical quantum 
computers and finding new quantum algorithms. (See [68] for a survey of this area, as 
it stood in 2000.) 

In this dissertation we study complexity classes based on quantum computation. 
On one hand, this is motivated by the possibility that we may eventually succeed in 
building scalable quantum computers (thus demonstrating that this is a "reasonable" 
model of computation). But quantum complexity theory is also interesting because it 
gives new insights into problems that we care about, whether or not we have a quantum 
computer. This dissertation focuses on a few such problems, including some "real-world" 
problems from quantum chemistry, whose complexity is best characterized using ideas 
from quantum (as opposed to classical) computation. 

We study the "consistency problem for local quantum states," which is defined as
follows (omitting some details). Suppose we have a system of $n$ qubits, and we are given
a collection of local density matrices $\rho_1, \ldots, \rho_m$, where each $\rho_i$ describes
a subset $C_i$ of the qubits. We assume that $|C_i| \le k$, for some fixed constant $k$. Then
the problem is to decide whether the $\rho_i$ are "consistent," i.e., whether there exists some
global state $\sigma$ (on all $n$ qubits) that matches each of the $\rho_i$ on the subsets $C_i$.
This problem was originally proposed by Dorit Aharonov [5]. This dissertation presents new
results on the computational complexity of the consistency problem, as well as related
problems from quantum chemistry and condensed matter physics.

In chapter 2, we show that the consistency problem is QMA-complete, where QMA is
the natural generalization of the complexity class NP to the setting of quantum computation.
Before this, there was a canonical QMA-complete problem, the Local Hamiltonian
problem, as shown by Kitaev [53]. Local Hamiltonian can be viewed as a generalization
of Max-$k$-SAT, or in physical terms, as the problem of estimating the ground state
energy of a system of spins with local interactions. Subsequent work showed that the
problem remains QMA-hard for 2-body interactions [52, 51], even when restricted to nearest
neighbors on a 2-D square lattice [69]; the problem is also QMA-hard for nearest
neighbors on a 1-D chain where each site is not a qubit, but a qudit of dimension $d > 8$
[6, 25]. However, these were essentially the only known QMA-complete problems (aside
from a few problems which are closely related to the definition of QMA). With the
Consistency problem, we give the first real example of a QMA-complete problem that
is not a variant of Local Hamiltonian. In particular, Consistency is best described as a
constraint satisfaction problem, while Local Hamiltonian is an optimization problem.

We give a poly-time oracle reduction from Local Hamiltonian to Consistency, using
algorithms for convex optimization with a membership oracle. This kind of reduction
is quite unusual. After our paper was published, we became aware of work by Gurvits
[40] that used a similar technique to show NP-hardness of the separability problem for
quantum states, and also an older paper by Grötschel et al. [37] that used a weaker technique
(convex optimization with a separation oracle) to show NP-hardness of weighted
fractional chromatic number; but these seem to be the only previous examples. Here we
develop the technique in greater detail. The usual approach is to use algorithms such as
the shallow-cut ellipsoid method of Yudin and Nemirovskii [88, 38], or the random-walk
algorithm of Bertsimas and Vempala [17, 49]. We find that much simpler algorithms are
sufficient for this application, because we only need to find approximate solutions (with
accuracy $\pm 1/\mathrm{poly}(n)$), as opposed to exact solutions (accuracy $\pm 2^{-n}$).

In chapter 3, we study the N-representability problem, which is an analogue of the
consistency problem for fermionic systems. (This chapter is joint work with Matthias
Christandl and Frank Verstraete.) N-representability was first introduced by quantum
chemists in the 1960's, as a route to computing the ground states of molecular systems
[30, 78, 28]; beginning in the 1990's, it has received renewed attention, thanks to improved
variational methods based on semidefinite programming, and brand new methods
such as the contracted Schrödinger equation [29] HZl [M]. (Collectively these are known
as 2-RDM methods.)

We show that fermionic Local Hamiltonian is QMA-hard, by constructing a mapping
from spin systems to fermionic systems. Then we show that N-representability is QMA-hard,
using the convex optimization technique from chapter 2. Ironically, this is the
same idea that the quantum chemists use to design algorithms, but restated in a much
more general form: we show that any efficient solution to N-representability would imply
an efficient algorithm to compute ground state energies, not just for molecules, but for
generic local Hamiltonians, and this is QMA-hard. Finally, we show that fermionic
Local Hamiltonian and N-representability are in QMA, and hence are QMA-complete.
In addition, we show that a related problem, pure-state N-representability, is in the
class QMA(2) (see chapter 3 for details).

Our hardness result implies that 2-RDM methods must break down in the general 
case. But there is empirical evidence that 2-RDM methods perform well on instances that 
arise in quantum chemistry. It would be wonderful to find some theoretical explanation
for this. Is there some fundamental property of these instances that explains the success 
of 2-RDM methods? Also, it is not clear whether 2-RDM methods still give accurate 
results when scaled up to larger molecules; a theoretical analysis would be helpful in 
answering this question. 

In chapter 4, we study the consistency problem for 1-dimensional and "stoquastic"
systems. These are interesting special cases, for which the Local Hamiltonian problem
may not be QMA-hard. (Local Hamiltonian on a 1-D chain of qudits (for $d > 8$) is
QMA-hard [6, 25], but this is not known for smaller values of $d$, e.g., qubits. Also,
there is complexity-theoretic evidence that Stoquastic Local Hamiltonian is not QMA-hard,
though it is at least MA-hard [21].) We show that 1-D Consistency has the same
complexity as 1-D Local Hamiltonian, up to poly-time oracle reductions. Also, we propose
a stoquastic version of the Consistency problem, which appears to be equivalent
to Stoquastic Local Hamiltonian, up to poly-time oracle reductions. These results suggest
that, for special classes of systems, Consistency may provide an alternative route
to solving Local Hamiltonian. (This is the approach used in the 2-RDM methods in
quantum chemistry.)

For these special cases, the reduction from Local Hamiltonian to Consistency uses 
the same ideas as before, but the reverse direction requires a new technique, since we 
can no longer use the machinery of QMA-hardness. We give a new reduction from Consistency
to Local Hamiltonian that is based on Lagrange duality (combined with convex
optimization using a membership oracle). The duality idea is similar to recent work
by Hall [41] on the "subsystem compatibility problem"; this resembles the Consistency
problem, except that one is given density matrices describing all proper subsets of the
system. (Thus the description of the problem is exponentially large in the number of
qubits, and the problem is poly-time solvable, using an amount of time that is polynomial
in the size of the input, but exponential in the number of qubits.) Previously,
duality techniques have also been used in the study of entanglement, e.g., the notion of
an "entanglement witness" [43].

In chapter 5, we show an interesting structural property of consistent quantum states:
if $\rho_1, \ldots, \rho_m$ are consistent with some state $\sigma \succ 0$, then they are also consistent with
a Gibbs state $\sigma' = (1/Z) \exp(M_1 + \cdots + M_m)$. This result was previously proved by
Jaynes [47] in connection with the maximum-entropy principle; here we give a somewhat
different proof, using the partition function.

1.2 Quantum computation 

Consider a quantum mechanical system. For our purposes, the state of the system is
described by a unit vector $|\psi\rangle$ in a vector space $\mathbb{C}^d$. Here we assume that the dimension $d$
is finite, i.e., we do not consider systems with continuous degrees of freedom, arbitrarily
many particles, unbounded energy, etc. We also assume that the state is pure, or
deterministic (later we will come back to discuss mixed states). We remark that the
complex phase of the vector $|\psi\rangle$ is unimportant: for any $\theta \in \mathbb{R}$, the vectors $|\psi\rangle$ and
$e^{i\theta}|\psi\rangle$ have the same physical meaning. We use "bracket" notation: $|\psi\rangle$ denotes a
column vector, while $\langle\psi|$ denotes its adjoint, or conjugate transpose, which is a row
vector.

Various operations on the system are described by linear operators on $\mathbb{C}^d$, that is,
complex $d \times d$ matrices. For an operator $A$, we define $A^\dagger$ to be the adjoint, or conjugate
transpose, of $A$. If the system is closed (it does not interact with an outside environment),
then the state evolves via unitary operations: $|\psi\rangle$ evolves to $U|\psi\rangle$, where $U$ is a unitary
matrix, that is, $U^\dagger = U^{-1}$. Note that this operation preserves the length of the vector
$|\psi\rangle$.

For our purposes, a measurement is described by an observable $O$, which is a Hermitian
matrix, that is, $O^\dagger = O$. By the spectral theorem, $O$ can be written in the form
$O = \sum_i \lambda_i \Pi_i$, where the $\lambda_i$ are distinct real numbers, and the $\Pi_i$ are projectors onto
orthogonal subspaces. Here the $\lambda_i$ represent the possible outcomes of the measurement:
if the system is in state $|\psi\rangle$, then the measurement yields outcome $\lambda_i$ with probability
$p_i = \langle\psi|\Pi_i|\psi\rangle$; following the measurement, the system will be in state $(1/\sqrt{p_i})\,\Pi_i|\psi\rangle$.
Thus the expectation value of the measurement is given by $\langle\psi|O|\psi\rangle$.

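A special kind of measurement is the following: we have an orthonormal basis
$\{|\varphi_1\rangle, \ldots, |\varphi_d\rangle\}$, and we let $O = \sum_{i=1}^{d} i\,|\varphi_i\rangle\langle\varphi_i|$. If the system is in state
$|\psi\rangle = \sum_{i=1}^{d} \alpha_i|\varphi_i\rangle$, then the measurement yields outcome $i$ with probability $|\alpha_i|^2$;
following the measurement, the system will be in state $|\varphi_i\rangle$. This is called a measurement
in the basis $\{|\varphi_1\rangle, \ldots, |\varphi_d\rangle\}$.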
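The measurement rule above can be illustrated with a minimal sketch. It assumes states and basis vectors are given as plain Python lists of complex amplitudes; the helper name `measurement_probabilities` is hypothetical and chosen only for this illustration.

```python
import math

def measurement_probabilities(state, basis):
    """Return |<phi_i|psi>|^2 for each vector phi_i of an orthonormal basis."""
    probs = []
    for phi in basis:
        # Inner product <phi|psi>: conjugate the entries of phi.
        amplitude = sum(complex(p).conjugate() * s for p, s in zip(phi, state))
        probs.append(abs(amplitude) ** 2)
    return probs

r = 1 / math.sqrt(2)
plus = [r, r]                          # |+> = (|0> + |1>)/sqrt(2)
computational = [[1, 0], [0, 1]]       # basis {|0>, |1>}
hadamard_basis = [[r, r], [r, -r]]     # basis {|+>, |->}

probs_z = measurement_probabilities(plus, computational)
probs_x = measurement_probabilities(plus, hadamard_basis)
```

Measuring the same state in two different bases gives different outcome statistics: here `probs_z` is (up to rounding) uniform, while `probs_x` puts all of the weight on the first outcome.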

The basic building block of a quantum computer is the qubit. This is a two-dimensional
system, whose state is a unit vector in $\mathbb{C}^2$. We fix an orthonormal basis for
$\mathbb{C}^2$, which consists of two states, $|0\rangle$ and $|1\rangle$; then we can write $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$.

We can construct more complex systems by assembling multiple qubits. To describe
this, we need to define the tensor product. Let $A$ and $B$ be vector spaces, of dimension
$d_A$ and $d_B$. For any vectors $a \in A$ and $b \in B$, the tensor product $a \otimes b$ is a vector of
dimension $d_A d_B$, where the tensor operation satisfies the following properties: (1) for
any vectors $a \in A$, $b \in B$, and any scalar $s$, we have $s(a \otimes b) = (sa) \otimes b = a \otimes (sb)$; (2)
for any vectors $a, a' \in A$, $b \in B$, we have $(a + a') \otimes b = a \otimes b + a' \otimes b$; (3) for any vectors
$a \in A$, $b, b' \in B$, we have $a \otimes (b + b') = a \otimes b + a \otimes b'$.

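Furthermore, $A \otimes B$ is the vector space of dimension $d_A d_B$, consisting of all linear
combinations of tensor products, that is, all vectors of the form

$$\sum_{a,b} u_{ab}\,(a \otimes b).$$

There is a natural inner product on this space: one defines $\langle a \otimes b, a' \otimes b'\rangle = \langle a, a'\rangle\langle b, b'\rangle$,
and extends it using linearity (conjugate-linearity in the first argument) to get

$$\Big\langle \sum_{a,b} u_{ab}\,(a \otimes b),\; \sum_{a',b'} v_{a'b'}\,(a' \otimes b') \Big\rangle = \sum_{a,b} \sum_{a',b'} \overline{u_{ab}}\, v_{a'b'}\, \langle a, a'\rangle \langle b, b'\rangle.$$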

Note that, if $\{a^{(i)}\}$ is an orthonormal basis for $A$, and $\{b^{(j)}\}$ is an orthonormal basis for
$B$, then $\{a^{(i)} \otimes b^{(j)}\}$ is an orthonormal basis for $A \otimes B$. Also, given operators $P$ and $Q$
acting on the spaces $A$ and $B$, respectively, one can construct an operator $P \otimes Q$ acting
on the space $A \otimes B$ by defining $(P \otimes Q)(a \otimes b) = (Pa) \otimes (Qb)$, and extending it by
linearity.

More concretely, one can write the tensor product $a \otimes b$ by taking the vector $a$ and
replacing each scalar entry $a_i$ with a block consisting of the vector $a_i b$ (this is known
as the Kronecker product). For example, $(a_1, a_2)^T \otimes (b_1, b_2)^T = (a_1 b_1, a_1 b_2, a_2 b_1, a_2 b_2)^T$.
One can write the tensor product of two matrices in a similar way.
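The Kronecker-product description can be sketched in a few lines. This is a minimal illustration using nested Python lists; the helper names `kron_vec`, `kron_mat` and `matvec` are hypothetical, and the example matrices are chosen only to check the identity $(P \otimes Q)(a \otimes b) = (Pa) \otimes (Qb)$ on one instance.

```python
def kron_vec(a, b):
    """Kronecker product of two vectors: each entry a_i is replaced by the block a_i * b."""
    return [x * y for x in a for y in b]

def kron_mat(P, Q):
    """Kronecker product of two matrices: each entry P_ij is replaced by the block P_ij * Q."""
    return [[p * q for p in row_p for q in row_q]
            for row_p in P for row_q in Q]

def matvec(M, v):
    """Matrix-vector product for nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

a, b = [1, 2], [3, 4]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

ab = kron_vec(a, b)                               # (a1*b1, a1*b2, a2*b1, a2*b2)
lhs = matvec(kron_mat(X, Z), ab)                  # (X tensor Z)(a tensor b)
rhs = kron_vec(matvec(X, a), matvec(Z, b))        # (X a) tensor (Z b)
```

Both sides agree on this example, as the defining property of $P \otimes Q$ requires.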

If we have two quantum systems, described by state spaces $A$ and $B$, then the
combined system is described by the state space $A \otimes B$. Also, if $P$ is a unitary operation
or an observable for the first system, then $P \otimes I$ is the equivalent operation for the
combined system; likewise, if $Q$ is an operation for the second system, then $I \otimes Q$ is the
equivalent operation for the combined system.

So, a system of $n$ qubits has a state space $(\mathbb{C}^2)^{\otimes n} = \mathbb{C}^2 \otimes \cdots \otimes \mathbb{C}^2$, that is, a tensor
product of $n$ copies of $\mathbb{C}^2$. This is a vector space of dimension $2^n$, with an orthonormal
basis consisting of the vectors $|z\rangle = |z_1\rangle \otimes \cdots \otimes |z_n\rangle$, $z \in \{0, 1\}^n$. We refer to this as the
standard or computational basis.

We are ready to introduce the quantum circuit model of computation. We define 
a quantum computer to be a device that can perform the following tasks on n qubits 
(using resources that scale polynomially in n): (1) prepare qubits in the computational 
basis states; (2) implement a universal family of quantum gates, which can be applied 
to any subset of qubits; (3) measure qubits in the computational basis. Here, a quantum 
gate is simply a unitary operation on a fixed number of qubits (that does not grow with 
n). We assume a fixed, finite set of gates; circuits on n qubits are built by composing 
these gates. We say that a set of gates $S$ is universal if, for any unitary operation $U$,
one can approximate $U$ with error $\epsilon$ by using a circuit of size $O(\mathrm{poly}(1/\epsilon))$ consisting
of gates from $S$. (Note that the $O(\mathrm{poly}(1/\epsilon))$ contains a hidden constant that depends
on $U$.) This implies that, for any set of gates $S'$, a circuit of size $m$ consisting of gates
from $S'$ can be simulated with error $\epsilon$ by using a circuit of size $O(\mathrm{poly}(m/\epsilon))$ consisting
of gates from $S$. (Again, the hidden constant depends on $S'$.)

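For example, the following gates are a universal set: controlled-NOT (CNOT),
Hadamard ($H$), phase ($S$), $\pi/8$ gate ($T$). Controlled-NOT is a two-qubit gate, while the
others are single-qubit gates. They are defined as follows:

$$\mathrm{CNOT} = \begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&0&1\\ 0&0&1&0 \end{pmatrix},\quad H = \frac{1}{\sqrt{2}}\begin{pmatrix} 1&1\\ 1&-1 \end{pmatrix},\quad S = \begin{pmatrix} 1&0\\ 0&i \end{pmatrix},\quad T = \begin{pmatrix} 1&0\\ 0&e^{i\pi/4} \end{pmatrix}.$$

Furthermore, for any unitary transformation $U$, the number of gates needed to approximate
$U$ with error $\epsilon$ grows like $O(\log^c(1/\epsilon))$, $c \approx 2$; this is known as the Solovay-Kitaev
theorem. See [68] for more details and proofs of these results.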
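As a sanity check on the gate set named above, the following sketch verifies numerically that the standard matrices for CNOT, $H$, $S$ and $T$ are unitary, and that $T^2 = S$ and $S^2 = Z$. It assumes the usual textbook matrix forms of these gates; the helper names are hypothetical and the code says nothing about universality itself.

```python
import cmath
import math

def matmul(A, B):
    """Multiply two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(A):
    """Conjugate transpose of a square matrix."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

def close(A, B, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    n = len(A)
    return all(abs(A[i][j] - B[i][j]) <= tol for i in range(n) for j in range(n))

def is_unitary(U):
    n = len(U)
    identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    return close(matmul(dagger(U), U), identity)

r = 1 / math.sqrt(2)
H = [[r, r], [r, -r]]
S = [[1, 0], [0, 1j]]
T = [[1, 0], [0, cmath.exp(1j * math.pi / 4)]]
Z = [[1, 0], [0, -1]]
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]

all_unitary = all(is_unitary(G) for G in (H, S, T, CNOT))
t_squared_is_s = close(matmul(T, T), S)
s_squared_is_z = close(matmul(S, S), Z)
```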

We remark that there are other equivalent models of quantum computation, such as 
the quantum Turing machine [16] (see also [68] for references to earlier work in this area) , 
and models motivated by possible experimental implementations of quantum computers. 

1.3 Quantum Complexity Classes



We define the class BQP ("bounded-error quantum polynomial time"), by analogy with
BPP ("bounded-error probabilistic polynomial time") [16]: a language $L$ is in BQP if
there exists a poly-time quantum algorithm $A$ such that

• If $x \in L$, then $A(x)$ accepts with probability $\ge 2/3$.

• If $x \notin L$, then $A(x)$ accepts with probability $\le 1/3$.

To be precise, $A$ is a uniform family of quantum circuits, of polynomial size. Similarly
to BPP, the success probabilities can be amplified via repetition.
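One standard way to amplify is to run the algorithm an odd number of times independently and take a majority vote; this particular scheme is an assumption made for illustration, since the text only says "repetition". The sketch below computes the exact failure probability of the majority vote from the binomial distribution. The function name `majority_error` is hypothetical.

```python
from math import comb

def majority_error(p_correct, runs):
    """Probability that a majority vote over `runs` independent runs is wrong.

    Assumes `runs` is odd and each run is correct independently with
    probability `p_correct`. The vote fails when at most runs // 2 runs are correct.
    """
    return sum(comb(runs, k) * p_correct ** k * (1 - p_correct) ** (runs - k)
               for k in range(runs // 2 + 1))

errors = {runs: majority_error(2 / 3, runs) for runs in (1, 3, 11, 101)}
```

With a single run the error is $1/3$; with three runs it is $7/27$; and it keeps shrinking (exponentially fast in the number of runs, by a Chernoff-type bound) as more runs are added.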

The class QMA, or "Quantum Merlin-Arthur," is defined as follows [84, 7]: a
language $L$ is in QMA if there exists a poly-time quantum verifier $V$ and a polynomial
$p$ such that

• If $x \in L$, then there exists a quantum state $\rho$ on $p(|x|)$ qubits such that $V(x, \rho)$
accepts with probability $\ge 2/3$.

• If $x \notin L$, then for all quantum states $\rho$ on $p(|x|)$ qubits, $V(x, \rho)$ accepts with
probability $\le 1/3$.

Here, $|x|$ denotes the length of the string $x$. This is similar to the definition of MA or NP,
except that the witness is allowed to be a quantum state, and the verifier is a quantum
circuit with bounded error probability. The success probabilities can be amplified via
parallel repetition; see the discussion in [7].

Note that one can easily restate these definitions in terms of promise problems, rather 
than languages. 

We give a brief summary of the known relationships between BQP, QMA and other 
complexity classes. Definitions of the other classes can be found in [72].

First, it is not hard to see that BPP $\subseteq$ BQP, BQP $\subseteq$ QMA, and MA $\subseteq$ QMA.

BQP and QMA are contained in "counting" classes such as #P. In particular, BQP $\subseteq$
PP ^ (see [32] for a simpler proof); a stronger result is BQP $\subseteq$ AWPP [3"^. Also, QMA
$\subseteq$ PP [51]; a stronger result is QMA $\subseteq$ A$_0$PP [83]. However, these upper bounds do not
seem to be tight. PP is quite a powerful class; note that $\mathrm{P}^{\mathrm{PP}}$ contains the polynomial
hierarchy PH (Toda's theorem). Also, PP seems to be much more powerful than BQP;
note that PP = PostBQP (BQP with postselection) p].

Much less is known about the relationship between BQP and the polynomial hierarchy
PH. (Recall that PH is the union of the classes NP, coNP, $\Sigma_2 = \mathrm{NP}^{\mathrm{NP}}$,
$\Pi_2 = (\mathrm{coNP})^{\mathrm{NP}}$, ....) We do know that, relative to a random oracle, with probability
1, BQP does not contain NP [15]. Since BPP and MA are contained in PH, one
might expect that BQP would be in PH, but this is not known.

1.4 The Local Hamiltonian Problem 

The Local Hamiltonian problem is defined as follows [53, 7]:

Consider a system of $n$ qubits. We are given a Hamiltonian $H = H_1 + \cdots + H_m$,
where each $H_i$ acts on a subset of qubits $C_i \subseteq \{1, \ldots, n\}$ (and so has
dimension $2^{|C_i|} \times 2^{|C_i|}$). The $H_i$ are Hermitian matrices, with norm $\|H_i\| \le 1$.
Also, each subset $C_i$ has size $|C_i| \le k$, for some fixed $k$.

All numbers are specified with $\gamma$ bits of precision.

In addition, we are given a string "$1^s$" (the unary encoding of a natural
number $s$), and two real numbers $a$ and $b$, such that $b - a \ge 1/s$.

The problem is to distinguish between the following two cases:

• If $H$ has an eigenvalue that is $\le a$, output "YES."

• If all the eigenvalues of $H$ are $\ge b$, output "NO."

Note that one may have multiple terms in the Hamiltonian that act on the same
subset; so the subsets $C_i$ might not all be distinct.
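To make the definition concrete, here is a minimal sketch for the special case where every term $H_i$ is diagonal in the computational basis, which is the sense in which Local Hamiltonian generalizes Max-$k$-SAT. Each clause is encoded as a term that contributes energy 1 exactly on the assignments violating it, so the smallest eigenvalue of $H$ is the minimum number of violated clauses. The clause encoding and the names `violated`, `ground_energy` and `decide` are assumptions made for this illustration; the brute-force search takes time exponential in $n$ and is not an efficient algorithm.

```python
from itertools import product

def violated(clause, z):
    """A clause is a list of (variable_index, wanted_bit) literals.

    It is violated by assignment z iff none of its literals is satisfied.
    """
    return all(z[i] != bit for i, bit in clause)

def ground_energy(n, clauses):
    """Smallest eigenvalue of the diagonal Hamiltonian: min number of violated clauses."""
    return min(sum(violated(c, z) for c in clauses)
               for z in product((0, 1), repeat=n))

def decide(n, clauses, a, b):
    """Return 'YES' if some eigenvalue is <= a, 'NO' if all are >= b, else None."""
    energy = ground_energy(n, clauses)
    if energy <= a:
        return "YES"
    if energy >= b:
        return "NO"
    return None  # the instance is neither a YES nor a NO instance

# All four 2-literal clauses on two variables: every assignment violates exactly one.
all_four = [[(0, 1), (1, 1)], [(0, 0), (1, 1)], [(0, 1), (1, 0)], [(0, 0), (1, 0)]]
answer_unsat = decide(2, all_four, a=0, b=1)
answer_sat = decide(2, all_four[:3], a=0, b=1)
```

Here each term acts on $k = 2$ qubits and has norm 1, and the thresholds satisfy $b - a \ge 1/s$ with $s = 1$.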

The string "$1^s$" is simply a device to ensure that the gap between the "YES" and
"NO" cases is not too small, relative to the "size" of the problem.

Intuitively, we think of $n$ as the "size" of the problem, and we are interested in
instances where $k$ is a constant, $m \le \mathrm{poly}(n)$, $\gamma \le \mathrm{poly}(n)$ and $s \le \mathrm{poly}(n)$ (so the gap
$b - a$ is at least $1/\mathrm{poly}(n)$). We say an algorithm is efficient if it solves these instances
in time $\mathrm{poly}(n)$.

Formally, an instance of the problem is described by a string of length $\ell = O(4^k m \gamma + s)$.
We say an algorithm is efficient if it takes time polynomial in $\ell$. Although this formal
definition looks different from our intuition, it is equivalent, as we will see in the next
section.

Finally, note that this is a promise problem: we are promised that the input is either 
a "YES" instance or a "NO" instance. 

Special cases of the problem include 2-Local Hamiltonian (where $k = 2$), and 2-Local
Hamiltonian on a graph $G$ (where $k = 2$, and the graph $G'$, consisting of vertices $1, \ldots, n$
and edges $C_1, \ldots, C_m$, is restricted to be a subgraph of $G$).

Kitaev showed that Local Hamiltonian is in QMA, and the case of $k = 5$ is QMA-hard
[53, 7]. With greater effort, one can show that 2-Local Hamiltonian is also QMA-hard
[52, 51]. These hard instances of Local Hamiltonian do have the property that
$m \le \mathrm{poly}(n)$ and $s \le \mathrm{poly}(n)$.

(This is a slight abuse of notation, because QMA is a class of languages, whereas 
Local Hamiltonian is a promise problem.) 

1.5 Promise Problems and Polynomial Time 

In the previous section we considered two notions of what it means to solve the Local
Hamiltonian problem efficiently. Assume $k$ is constant, so an instance of the problem is
described by a string of length $\ell = O(m\gamma + s)$. Intuitively, we believe an algorithm is
efficient if, on instances where $m \le \mathrm{poly}(n)$, $\gamma \le \mathrm{poly}(n)$ and $s \le \mathrm{poly}(n)$, the algorithm
takes time $\mathrm{poly}(n)$. Formally, we say an algorithm is efficient if, on all instances, it takes
time $\mathrm{poly}(\ell)$. We now show that, under some mild conditions, these two notions are
equivalent.

We say that Local Hamiltonian is polynomial-time solvable if: 

There exists an algorithm $A$ and a polynomial $t$, such that on all instances,
$A$ returns the correct answer in time $t(\ell)$.

Let (S) denote the following statement, which corresponds more closely to our intuition:

There exists an algorithm $A$ and a polynomial $t$, and there exist constants
$\alpha, \beta > 0$, such that for any instance that satisfies $m, \gamma, s \le \alpha n^\beta$, $A$ returns
the correct answer in time $t(n)$.

Statement (S) asserts that, for some specific bounds on the size of $m$, $\gamma$ and $s$ as
a function of $n$, the algorithm $A$ runs in time $\mathrm{poly}(n)$. These bounds can be very
restrictive, for instance, they may be sublinear in $n$. Thus (S) appears to be a weaker
condition, because it does not say anything about the running time for other values of
$m$, $\gamma$ and $s$.

Obviously, if Local Hamiltonian is poly-time solvable, then (S) holds. We will show
the reverse implication, using a padding argument. Suppose that (S) holds. We will
construct a modified algorithm $A'$ that solves arbitrary instances of Local Hamiltonian.
$A'$ takes an instance $x$, modifies it by adding extra "dummy" qubits to the problem,
thus increasing $n$ until it satisfies the promises stated in condition (S), and then runs
algorithm $A$. On an input of length $\ell$, algorithm $A'$ takes time

$$\max\{t(n),\, t((m/\alpha)^{1/\beta}),\, t((\gamma/\alpha)^{1/\beta}),\, t((s/\alpha)^{1/\beta})\} \le \mathrm{poly}(\ell),$$

hence Local Hamiltonian is poly-time solvable.
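The padding step can be sketched as a small calculation: given an instance's parameters and the constants from statement (S), find how many qubits the padded instance needs so that $m, \gamma, s \le \alpha n^\beta$ holds. The function name `padded_qubit_count` and the sample parameter values are hypothetical; the sketch only computes the padded size and does not build the padded Hamiltonian.

```python
import math

def padded_qubit_count(n, m, gamma, s, alpha, beta):
    """Smallest n' >= n such that max(m, gamma, s) <= alpha * n'**beta."""
    need = max(m, gamma, s)
    n_prime = max(n, math.ceil((need / alpha) ** (1.0 / beta)))
    # Guard against floating-point rounding when taking the root.
    while alpha * n_prime ** beta < need:
        n_prime += 1
    return n_prime

# Quadratic bound: m, gamma, s <= n^2 forces n' = 10 when m = 100.
case_quadratic = padded_qubit_count(4, 100, 10, 50, alpha=1.0, beta=2.0)
# Sublinear bound: m, gamma, s <= 2 * sqrt(n) forces a much larger n'.
case_sublinear = padded_qubit_count(3, 100, 10, 50, alpha=2.0, beta=0.5)
# Already within the bound: no padding is needed.
case_no_padding = padded_qubit_count(20, 5, 5, 5, alpha=1.0, beta=1.0)
```

In each case the padded size is polynomial in $m$, $\gamma$ and $s$ (for fixed $\alpha$ and $\beta$), which is what makes the running time of the padded algorithm polynomial in the input length.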

So statement (S) is equivalent to poly-time solvability. So we can use either of these 
notions; it turns out that the latter one is more convenient and less cumbersome. Similar 
arguments apply to other promise problems. 

1.6 Density Matrices 

Consider a system of $n$ qubits. Up to this point we have dealt with pure states, which
are represented by vectors in $(\mathbb{C}^2)^{\otimes n}$. However, one may also encounter mixed states,
which are ensembles of pure states, where each state $|\psi_i\rangle$ appears with some probability
$p_i$. (For simplicity we assume a discrete ensemble $\{|\psi_i\rangle\}$; continuous ensembles can be
treated in a similar way.) It turns out that a mixed state is represented by a density
matrix, which is a positive semidefinite matrix on $(\mathbb{C}^2)^{\otimes n}$ with trace 1, defined by

$$\rho = \sum_i p_i\, |\psi_i\rangle\langle\psi_i|.$$

In particular, a pure state $|\psi\rangle$ is represented by the density matrix $|\psi\rangle\langle\psi|$. Also, if we
have an ensemble where each element is a mixed state $\rho_i$, which appears with probability
$p_i$, then the ensemble is described by the density matrix $\rho = \sum_i p_i \rho_i$.

Interestingly, it is possible for two seemingly different ensembles to be represented
by the same density matrix. For instance, an equal mixture of $|0\rangle$ and $|1\rangle$ yields the
same density matrix as an equal mixture of $|+\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$ and $|-\rangle = \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)$.
Quantum mechanics asserts that all of the physically accessible information is contained
in the density matrix. So in cases like this, the two ensembles cannot be distinguished
by an observer.
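The two-ensemble example can be checked directly. The sketch below builds $\rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|$ for both ensembles using nested Python lists; the helper names `outer` and `density_matrix` are hypothetical.

```python
import math

def outer(v):
    """Return the projector |v><v| as a nested list."""
    return [[complex(x) * complex(y).conjugate() for y in v] for x in v]

def density_matrix(ensemble):
    """Build rho = sum_i p_i |psi_i><psi_i| from a list of (p_i, psi_i) pairs."""
    d = len(ensemble[0][1])
    rho = [[0j] * d for _ in range(d)]
    for p, psi in ensemble:
        projector = outer(psi)
        for i in range(d):
            for j in range(d):
                rho[i][j] += p * projector[i][j]
    return rho

r = 1 / math.sqrt(2)
rho_z = density_matrix([(0.5, [1, 0]), (0.5, [0, 1])])    # mixture of |0>, |1>
rho_x = density_matrix([(0.5, [r, r]), (0.5, [r, -r])])   # mixture of |+>, |->

same = all(abs(rho_z[i][j] - rho_x[i][j]) < 1e-12
           for i in range(2) for j in range(2))
```

Both ensembles give the maximally mixed state $I/2$, so `same` holds up to floating-point rounding.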

One can reformulate the basic facts of quantum mechanics, using density matrices
instead of state vectors. A unitary operation $U$ transforms a density matrix $\rho$ to $U \rho U^\dagger$.
If we measure an observable $O = \sum_i \lambda_i \Pi_i$, we get outcome $\lambda_i$ with probability
$p_i = \mathrm{tr}(\Pi_i \rho)$; following the measurement, the system will be in state $(1/p_i)\,\Pi_i \rho \Pi_i$. Thus the
expectation value of the measurement is given by $\mathrm{tr}(O\rho)$. In particular, if we measure in a
complete orthonormal basis $\{|\varphi_1\rangle, \ldots, |\varphi_d\rangle\}$, we get outcome $i$ with probability $\langle\varphi_i|\rho|\varphi_i\rangle$
(these are simply the diagonal elements of $\rho$ in the basis $\{|\varphi_1\rangle, \ldots, |\varphi_d\rangle\}$); following the
measurement, the system will be in state $|\varphi_i\rangle\langle\varphi_i|$. Finally, if two quantum systems $A$
and $B$ are in states $\rho_A$ and $\rho_B$, then the combined system is in state $\rho_A \otimes \rho_B$.

Density matrices are a convenient tool for describing subsets of a quantum system.
Here the situation is more complicated than in the classical world, because of
the phenomenon of entanglement. For example, consider the following two-qubit state,
$|\Phi^+\rangle = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$. This is a pure state, and in the classical world, that would imply
that the two individual bits were pure (i.e., deterministic), and uncorrelated. But for
this quantum state, even though the overall state is pure, the two individual bits are
mixed (they can be either 0 or 1), and correlated (they are always equal). In fact, for
a quantum state, it is possible for a subset of the system to have higher entropy than
the system as a whole. These unusual effects are caused by entanglement; see [68] for a
further discussion of entanglement and its applications to quantum computation.

A subset of a quantum system is described by a reduced density matrix. Say we have
two quantum systems, with state spaces $A$ and $B$. Let $\rho$ be the state of the combined
system, i.e., a density matrix $\rho$ on the space $A \otimes B$, where $\rho$ is not necessarily of the
form $\eta \otimes \tau$. Let $\{|a_1\rangle, \ldots, |a_d\rangle\}$ be an orthonormal basis for $A$, and let
$\{|b_1\rangle, \ldots, |b_{d'}\rangle\}$ be an orthonormal basis for $B$. Then $\{|a_i\rangle \otimes |b_{i'}\rangle\}$ is a basis
for $A \otimes B$, and we can write $\rho$ in the form

$$\rho = \sum_{i,j=1}^{d} \sum_{i',j'=1}^{d'} \rho_{ii',jj'}\, |a_i\rangle\langle a_j| \otimes |b_{i'}\rangle\langle b_{j'}|.$$

Then the subset $A$ is described by the reduced density matrix

$$\rho^{(A)} = \mathrm{tr}_B(\rho).$$

Here we define the partial trace over $B$ by

$$\mathrm{tr}_B(\rho) = \sum_{i,j=1}^{d} \Big( \sum_{i'=1}^{d'} \rho_{ii',ji'} \Big) |a_i\rangle\langle a_j|.$$

This is also called "tracing over $B$." Intuitively, it is like computing a marginal probability
distribution, by summing over all possible values of $B$. It can be shown that the
result does not depend on the choice of basis for $B$. Furthermore, for any observable $O$
on the subsystem $A$, one can show that measuring $O$ with the reduced state $\rho^{(A)}$ yields
the same outcomes as measuring $O \otimes I$ with the original state $\rho$.
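A minimal sketch of the partial trace, assuming the combined state is stored as a nested list with the Kronecker-product index convention (the basis vector $|a_i\rangle \otimes |b_{i'}\rangle$ sits at position $i \cdot d_B + i'$, counting from zero). The helper names are hypothetical. Applied to the entangled two-qubit state $\frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$, it returns the maximally mixed state on the first qubit.

```python
import math

def outer(v):
    """Return the projector |v><v| as a nested list."""
    return [[complex(x) * complex(y).conjugate() for y in v] for x in v]

def partial_trace_B(rho, dA, dB):
    """Trace out the second tensor factor of a (dA*dB) x (dA*dB) matrix."""
    return [[sum(rho[i * dB + k][j * dB + k] for k in range(dB))
             for j in range(dA)]
            for i in range(dA)]

r = 1 / math.sqrt(2)
bell = [r, 0, 0, r]                      # (|00> + |11>)/sqrt(2)
reduced_bell = partial_trace_B(outer(bell), 2, 2)

product_state = [r, r, 0, 0]             # |0> tensor |+>
reduced_product = partial_trace_B(outer(product_state), 2, 2)
```

The pure entangled state has a mixed reduced state ($I/2$), while the product state $|0\rangle \otimes |+\rangle$ reduces to the pure state $|0\rangle\langle 0|$.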

Finally, we introduce the Pauli matrices, which are a useful tool for working with
density matrices. Let $X$, $Y$ and $Z$ denote the Pauli matrices for a single qubit,

$$X = \begin{pmatrix} 0&1\\ 1&0 \end{pmatrix},\quad Y = \begin{pmatrix} 0&-i\\ i&0 \end{pmatrix},\quad Z = \begin{pmatrix} 1&0\\ 0&-1 \end{pmatrix},$$

and define $\mathcal{P} = \{I, X, Y, Z\}$. We can construct $n$-qubit Pauli matrices by taking tensor
products $P = P_1 \otimes \cdots \otimes P_n \in \mathcal{P}^{\otimes n}$.

Any $2^n$-dimensional Hermitian matrix can be written as a real linear combination
of $n$-qubit Pauli matrices. Furthermore, the $n$-qubit Pauli matrices are orthogonal with
respect to the Hilbert-Schmidt inner product: $\mathrm{tr}(P^\dagger Q) = 2^n$ if $P = Q$, and 0 otherwise.
So, if $\sigma$ is an $n$-qubit state, we can write it in the form

$$\sigma = \frac{1}{2^n} \sum_{P \in \mathcal{P}^{\otimes n}} \alpha_P\, P,$$

where the coefficients are uniquely determined by $\alpha_P = \mathrm{tr}(P\sigma)$; note that these are the
expectation values of the Pauli matrices $P$. This application of the Pauli matrices is
closely related to quantum state tomography.
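The Pauli expansion can be checked on a small example. The sketch below computes the coefficients $\alpha_P = \mathrm{tr}(P\sigma)$ for every two-qubit Pauli string and rebuilds the state from them with the $1/2^n$ normalization. The example state (the projector onto $\frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$) and all helper names are choices made for this illustration.

```python
from itertools import product

PAULIS = {
    "I": [[1, 0], [0, 1]],
    "X": [[0, 1], [1, 0]],
    "Y": [[0, -1j], [1j, 0]],
    "Z": [[1, 0], [0, -1]],
}

def kron(P, Q):
    """Kronecker product of two matrices given as nested lists."""
    return [[p * q for p in row_p for q in row_q] for row_p in P for row_q in Q]

def pauli_string(label):
    """Build the n-qubit Pauli matrix for a label such as 'XZ'."""
    P = [[1]]
    for ch in label:
        P = kron(P, PAULIS[ch])
    return P

def pauli_coefficients(sigma, n):
    """Return alpha_P = tr(P sigma) for every n-qubit Pauli string P."""
    dim = 2 ** n
    coeffs = {}
    for chars in product("IXYZ", repeat=n):
        label = "".join(chars)
        P = pauli_string(label)
        tr = sum(P[i][k] * sigma[k][i] for i in range(dim) for k in range(dim))
        coeffs[label] = complex(tr).real  # real because P and sigma are Hermitian
    return coeffs

def reconstruct(coeffs, n):
    """Rebuild sigma = 2^{-n} * sum_P alpha_P P from its Pauli coefficients."""
    dim = 2 ** n
    sigma = [[0j] * dim for _ in range(dim)]
    for label, alpha in coeffs.items():
        P = pauli_string(label)
        for i in range(dim):
            for j in range(dim):
                sigma[i][j] += alpha * P[i][j] / dim
    return sigma

bell = [[0.5, 0, 0, 0.5], [0, 0, 0, 0], [0, 0, 0, 0], [0.5, 0, 0, 0.5]]
coeffs = pauli_coefficients(bell, 2)
nonzero = {label: a for label, a in coeffs.items() if abs(a) > 1e-12}
rebuilt = reconstruct(coeffs, 2)
```

For this state only the strings `II`, `XX`, `YY` and `ZZ` have nonzero coefficients, and `rebuilt` agrees with the original matrix.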

One can also write a reduced density matrix $\sigma^{(A)}$, where $A \subseteq \{1, \ldots, n\}$, in terms of
the Pauli matrices. We say that a Pauli matrix $P$ is supported on the set $A$ if, for all
$i \notin A$, $P_i = I$. Also, define the restriction of $P$ to $A$, $P|_A = \bigotimes_{i \in A} P_i$.

The partial trace acts on $P$ as follows: $\mathrm{tr}_{\{1,\ldots,n\}-A}(P) = 2^{n-|A|}\, P|_A$ if $P$ is supported
on $A$, and 0 otherwise. Thus we have

$$\sigma^{(A)} = \mathrm{tr}_{\{1,\ldots,n\}-A}(\sigma) = \frac{1}{2^{|A|}} \sum_{P \text{ supported on } A} \alpha_P\, P|_A.$$

In other words, the information contained in $\sigma^{(A)}$ is precisely the expectation values of
those Pauli matrices $P$ that are supported on $A$.

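We state a few definitions from matrix analysis [18]. For a vector $v \in \mathbb{C}^n$, we define
the $\ell_2$ and $\ell_1$ norms,

$$\|v\| = \|v\|_2 = \Big(\sum_i |v_i|^2\Big)^{1/2}, \qquad \|v\|_1 = \sum_i |v_i|.$$

For a matrix $A \in \mathbb{C}^{n \times n}$, we let $A^\dagger$ denote the conjugate transpose, and $|A| = \sqrt{A^\dagger A}$.
We define the sup, $\ell_2$ and $\ell_1$ norms,

$$\|A\| = \sup_{\|v\|=1} \|Av\|, \qquad \|A\|_2 = \sqrt{\mathrm{tr}(A^\dagger A)} = \Big(\sum_{i,j} |A_{ij}|^2\Big)^{1/2}, \qquad \|A\|_1 = \mathrm{tr}\,|A|.$$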

1.7 Consistency of Local Density Matrices

We define the Consistency problem as follows [5]:

Consider a system of $n$ qubits. We are given a collection of local density
matrices $\rho_1, \ldots, \rho_m$, where each $\rho_i$ acts on a subset of qubits $C_i \subseteq \{1, \ldots, n\}$
(and so has dimension $2^{|C_i|} \times 2^{|C_i|}$). Each subset $C_i$ has size $|C_i| \le k$, for
some constant $k$.

All numbers are specified with $\gamma$ bits of precision.

In addition, we are given a string "$1^s$" (the unary encoding of a natural
number $s$), and a real number $\beta$, such that $\beta \ge 1/s$.

The problem is to distinguish between the following two cases:

• There exists an $n$-qubit state $\sigma$ such that, for all $i$, $\mathrm{tr}_{\{1,\ldots,n\}-C_i}(\sigma) = \rho_i$.
In this case, output "YES."

• For all $n$-qubit states $\sigma$, there exists some $i$ such that
$\|\mathrm{tr}_{\{1,\ldots,n\}-C_i}(\sigma) - \rho_i\|_1 \ge \beta$. In this case, output "NO."

Without loss of generality, we can assume that the subsets $C_i$ are all distinct; thus
$m \le \binom{n}{k} \le n^k$. As in the Local Hamiltonian problem, the string "$1^s$" is simply a device
to ensure that the gap between the "YES" and "NO" cases is not too small, relative to
the "size" of the problem. Here, we use the norm $\|A\|_1 = \mathrm{tr}\,|A|$ to measure the distance
between $\rho_i$ and the corresponding reduced density matrix of $\sigma$. When multiplied by 1/2,
this is the trace distance.
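As a small illustration of the distance measure, the following sketch computes $\|A\|_1$ as the sum of singular values and the trace distance as half of that. The function names `trace_norm` and `trace_distance` are chosen here for illustration only.

```python
import numpy as np

def trace_norm(A):
    """||A||_1 = tr|A|, computed as the sum of the singular values of A."""
    return float(np.linalg.svd(A, compute_uv=False).sum())

def trace_distance(rho, sigma):
    """Trace distance: one half of the trace norm of the difference."""
    return 0.5 * trace_norm(rho - sigma)

# Example states on one qubit: |0><0| and |+><+|.
rho = np.array([[1, 0], [0, 0]], dtype=complex)
sigma = 0.5 * np.array([[1, 1], [1, 1]], dtype=complex)
gap = trace_norm(rho - sigma)
```

In the Consistency problem, a "NO" instance requires that some local reduced state differs from the given $\rho_i$ by at least $\beta$ in this norm.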

An instance of this problem is described by a string of length $\ell = O(4^k m \gamma + s)$. We
say that an algorithm is efficient if it takes time $\mathrm{poly}(\ell)$. The remarks made earlier about
polynomial-time solvability of Local Hamiltonian apply to this problem as well. We will
be interested in instances where $k$ is a constant, $m \le \mathrm{poly}(n)$ (see above), $\gamma \le \mathrm{poly}(n)$
and $s \le \mathrm{poly}(n)$; these instances are described by strings of length $\ell \le \mathrm{poly}(n)$.

An important special case is where $k = 2$. We can visualize the system as a graph
with nodes $1, \ldots, n$ and edges given by the subsets $C_1, \ldots, C_m$.



^1 Here the equality holds up to $\gamma$ bits of precision.

Chapter 2

Consistency of Local Density Matrices is QMA-complete

2.1 Introduction 

Quantum mechanical systems exhibit many unusual phenomena, such as coherent super- 
positions and nonlocal entanglement. It is interesting to compare this with the behavior 
of classical probabilistic systems. In a classical system, such as a Markov chain or a 
graphical model, one may have correlations or dependencies among different parts of 
the system; in particular, local properties can affect the joint probability distribution of 
the entire system. Many quantum systems have a similar flavor, though their behavior 
is more complicated. In this paper, we investigate one problem of this kind, and its 
relationship to the complexity class QMA. 

First, consider a classical problem. Suppose we have random variables $X_1, \ldots, X_n$,
with some unknown joint distribution $D$, and we are given marginal distributions
$D_1, \ldots, D_m$, where each $D_i$ describes a subset $C_i$ of the variables. (We assume that the
random variables $X_j$ take on values in some fixed finite set, and the subsets $C_i$ have size
at most some constant $k$.) Does there exist a joint distribution $D$ that matches each of
the marginals $D_i$ on the subsets $C_i$? If so, we say that the marginals $D_i$ are "consistent."

Deciding the consistency of marginal distributions is NP-hard, by a straightforward
reduction from 3-coloring. (We are given a graph $G = (V, E)$. For each vertex $v \in V$,
construct a random variable $X_v$ which takes on values in $\{r, g, b\}$. For each edge $(u, v) \in E$,
specify that the marginal distribution of $X_u$ and $X_v$ must be uniform over the set
$\{r, g, b\}^2 \setminus \{rr, gg, bb\}$. These marginals are consistent iff $G$ is 3-colorable.)
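One direction of this equivalence can be made concrete. The sketch below (function names are illustrative, not from the text) takes a proper 3-coloring and builds a joint distribution that is uniform over the six relabelings of the colors; every edge marginal is then uniform over the six ordered pairs of distinct colors. The converse direction holds because any joint distribution with these marginals is supported only on proper colorings.

```python
from fractions import Fraction
from itertools import permutations

COLORS = "rgb"

def edge_marginal_target():
    """Uniform distribution over ordered pairs of distinct colors."""
    pairs = [(a, b) for a in COLORS for b in COLORS if a != b]
    return {pair: Fraction(1, len(pairs)) for pair in pairs}

def joint_from_coloring(coloring):
    """Uniform mixture over all color permutations of a proper coloring.

    `coloring` maps vertex -> color; returns a list of (assignment, probability).
    """
    perms = list(permutations(COLORS))
    joint = []
    for perm in perms:
        relabel = dict(zip(COLORS, perm))
        assignment = {v: relabel[c] for v, c in coloring.items()}
        joint.append((assignment, Fraction(1, len(perms))))
    return joint

def marginal(joint, u, v):
    """Marginal distribution of (X_u, X_v) under the joint distribution."""
    out = {}
    for assignment, prob in joint:
        key = (assignment[u], assignment[v])
        out[key] = out.get(key, 0) + prob
    return out

# Example: a 4-cycle with a proper coloring that uses two colors.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
coloring = {0: "r", 1: "g", 2: "r", 3: "g"}
joint = joint_from_coloring(coloring)
consistent = all(marginal(joint, u, v) == edge_marginal_target() for u, v in edges)
```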

Now consider the generalization of this problem to quantum states. (This problem
was first suggested to me by Dorit Aharonov, in connection with the class QCMA [5].)
Suppose we have an $n$-qubit system, and we are given local density matrices $\rho_1, \ldots, \rho_m$,
where each $\rho_i$ describes a subset $C_i$ of the qubits. Does there exist a global state $\sigma$ on
all $n$ qubits that matches each of the local states $\rho_i$ on the subsets $C_i$? If so, we say that
the local states $\rho_i$ are "consistent."

We will show that this problem is QMA-complete, where QMA is the quantum
analogue of NP. QMA is the class of languages that have poly-time quantum verifiers,
where the witness is allowed to be a quantum state. QMA arises naturally in the study
of quantum computation, and it also has a complete problem, Local Hamiltonian, which
is a generalization of $k$-SAT [53, 7].

Our result is interesting, because we only know of a few QMA-complete problems, 
and most of them look like universal models of quantum computation. For instance, the 
fact that Local Hamiltonian is QMA-complete [53, 7, 52, 51, 69] is closely related to the
fact that adiabatic quantum computation is equivalent to the standard quantum circuit
model [9]. Other QMA-complete problems such as Identity Check involve properties of
quantum circuits [16]. The Consistency problem, however, does not seem to embody
any particular model of quantum computation; this will become clearer when we present 
our reduction from Local Hamiltonian. 

Why are there so few QMA-complete problems, when there is such an astonishing 
variety of NP-complete problems? The reason seems to be that the techniques used to 
show NP-hardness, such as mapping reductions using combinatorial gadgets, break down 
when we apply them to a "quantum" problem like Local Hamiltonian. For instance, to 
reduce Local Hamiltonian to the Consistency problem, we would try to use local density 
matrices to "simulate" local Hamiltonians. But we run into problems due to the presence 
of non-commuting matrices. (In cases where quantum gadgets do work, such as [51, 69],
they are much more subtle than classical gadgets.) 

Instead, our proof that the consistency problem is QMA-hard uses a poly-time oracle 
reduction from Local Hamiltonian. The basic idea is that Local Hamiltonian can be 
expressed as a convex program in polynomially many variables, which can be solved 
using convex optimization algorithms, given an oracle for the Consistency problem. In 
particular, we use a class of convex optimization algorithms [88, 38, 17, 49, 80] that
only require a membership oracle, rather than a separation oracle. We also use a simple 
representation of the local density matrices in terms of the expectation values of Pauli 
observables. 

Note that the Consistency problem has a rather different structure from Local Hamil- 
tonian. For instance, a local density matrix contains complete information about the 
local state of the system, whereas in many cases a local Hamiltonian only constrains the 
local state of the system to lie within a certain subspace. 

Finally, we remark that our reduction from Local Hamiltonian to Consistency pre- 
serves the "neighborhood structure" of the problem, in that the local density matrices 
act on the same subsets of qubits as the local terms in the Hamiltonian. So, using the 
QMA-hardness results for 2-Local Hamiltonian [51] and Local Hamiltonian on a 2-D 
square lattice [69], we can immediately get QMA-hardness results for the corresponding
special versions of the Consistency problem. 

We also mention some related work. In [24], one considers the Common Eigenspace
Problem, verifying the consistency of a set of eigenvalue equations $H_i|\psi\rangle = \lambda_i|\psi\rangle$, where
the operators $H_i$ commute. We do something similar, translating each local density
matrix into constraints on the expectation values of Pauli matrices, though in our case
the Pauli matrices do not commute. Also, in [20], one considers a quantum analogue of
2-SAT, where we seek a state whose local density matrices have support on prescribed
subspaces. However, this problem is more closely related to Local Hamiltonian than to
Consistency, since the constraints can be written in the form $H_i|\psi\rangle = 0$, where the $H_i$
are local projectors.

After our result was published, we became aware of some related work by Gurvits
[40], who used convex optimization with a membership oracle to show NP-hardness of
the separability problem for quantum states. Also, an older paper by Grötschel et al. [37]
used a simpler tool, convex optimization with a separation oracle, to show NP-hardness
of weighted fractional chromatic number.

This chapter is organized as follows. First, we show that Consistency is in QMA. 
Then we develop the technique of convex optimization with a membership oracle. We go 
into considerable detail, because we will use this tool in the following chapters as well. 
One particular contribution is to give algorithms for "approximate" convex optimization, 
where one is allowed to make additive errors of size $1/\mathrm{poly}(n)$; these algorithms are
much simpler than the algorithms of [88, 38, 17, 49]. Finally, we show that Consistency
is QMA-hard, by a reduction from Local Hamiltonian. 

2.2 Consistency is in QMA 

Theorem 2.1 Consistency is in QMA. 

Proof sketch: The basic idea is as follows. Given a witness state $\sigma$, the verifier will pick
a subset $C_i$ at random, and perform measurements to compare $\sigma$ (on the subset $C_i$) to
$\rho_i$. There is a complication, however, because the verifier requires many independent
copies of the witness $\sigma$, and a dishonest prover might try to cheat by entangling the
different copies. In spite of this, one can show that the verifier is still sound, using a
Markov argument. This argument is due to Aharonov and Regev, who used it to give
an alternative definition of QMA, known as QMA+ [8]. Using the QMA+ definition,
one can easily see that Consistency is in QMA. For the sake of clarity, however, we will
explicitly construct a QMA verifier for Consistency.
The verifier works as follows:

Set $\varepsilon = (1/2)(\beta/4^k)$ and $r = (16/\varepsilon^2)\ln(8 \cdot 4^k m/\varepsilon)$. (These are polynomially
related to the length of the input.)

Given a witness $\tau$, which is a quantum state on $rn$ qubits. (We view this as
$r$ registers, each consisting of $n$ qubits.)

Choose $i \in \{1, \ldots, m\}$ at random. Choose a Pauli matrix $Q \in \mathcal{P}^{\otimes |C_i|}$ (acting
on the subset $C_i$) at random.

Perform the following measurements on $\tau$: for $j = 1, \ldots, r$, measure the
observable $Q$ on the $j$'th register, and let $X_j \in \{1, -1\}$ denote the result.^2

Compute $Y = (1/r)\sum_{j=1}^r X_j$. If $|Y - \mathrm{tr}(Q\rho_i)| \le \varepsilon$, then output "YES";
otherwise, output "NO."
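The measurement of $Q$ can be realized with a single ancilla: prepare the ancilla in $|0\rangle$, apply a Hadamard gate, apply $Q$ controlled on the ancilla, apply another Hadamard gate, and measure the ancilla. The following is a small numerical sketch of that circuit on density matrices, under the assumption that the ancilla is the first tensor factor; the function names are illustrative. It checks that the outcome statistics reproduce $E(X_j) = \mathrm{tr}(Q\rho)$.

```python
import numpy as np

HADAMARD = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
PROJ0 = np.array([[1, 0], [0, 0]], dtype=complex)  # projector |0><0| on the ancilla

def prob_outcome_zero(Q, rho):
    """Probability that the ancilla measurement returns 0."""
    d = Q.shape[0]
    eye = np.eye(d, dtype=complex)
    zeros = np.zeros((d, d), dtype=complex)
    state = np.kron(PROJ0, rho)                          # ancilla starts in |0>
    had = np.kron(HADAMARD, eye)                         # Hadamard on the ancilla
    controlled_q = np.block([[eye, zeros], [zeros, Q]])  # apply Q if ancilla is 1
    U = had @ controlled_q @ had
    out = U @ state @ U.conj().T
    return float(np.real(np.trace(np.kron(PROJ0, eye) @ out)))

def expected_result(Q, rho):
    """E[X] where outcome 0 is mapped to +1 and outcome 1 to -1."""
    p0 = prob_outcome_zero(Q, rho)
    return p0 - (1.0 - p0)

PAULI_X = np.array([[0, 1], [1, 0]], dtype=complex)
PAULI_Z = np.array([[1, 0], [0, -1]], dtype=complex)
plus_state = 0.5 * np.array([[1, 1], [1, 1]], dtype=complex)  # |+><+|
```

For example, `expected_result(PAULI_X, plus_state)` is close to 1 and `expected_result(PAULI_Z, plus_state)` is close to 0, matching $\mathrm{tr}(Q\rho)$ in both cases.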

^2 One can measure $Q$ using the following procedure: introduce an ancilla qubit in the state $|0\rangle$, apply
a Hadamard gate on the ancilla, apply $Q$ controlled by the ancilla, apply another Hadamard gate on
the ancilla, and then measure the ancilla in the 0/1 basis. The "0" and "1" measurement outcomes
correspond to the $+1$ and $-1$ eigenvalues of $Q$.

Suppose we have a "YES" instance of Consistency, i.e., there exists an $n$-qubit state
$\sigma$ such that, for all $i$, $\mathrm{tr}_{\{1,\ldots,n\}-C_i}(\sigma) = \rho_i$. Then the correct witness is $\tau = \sigma^{\otimes r}$. For all
choices of $i$ and $Q$, the random variables $X_1, \ldots, X_r$ are i.i.d., with expectation value
$E(X_j) = \mathrm{tr}((Q \otimes I)\sigma) = \mathrm{tr}(Q\rho_i)$. We use the Chernoff bound. The following lemma can
be derived from [B7], and gives a simple but not especially tight bound. 

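Lemma 2.2 Let $X_1, \ldots, X_n$ be independent, 0-1-valued random variables, with $E(X_i) = p_i$,
$0 < p_i < 1$. Let $X = \sum_{i=1}^n X_i$, and let $\mu = E(X) = \sum_{i=1}^n p_i$. Then, for all $0 < \delta \le 1$,

$$\Pr\left[\frac{X}{n} \ge \frac{\mu}{n} + \delta\right] \le e^{-n\delta^2/4}, \qquad \Pr\left[\frac{X}{n} \le \frac{\mu}{n} - \delta\right] \le e^{-n\delta^2/4}.$$

Hence

$$\Pr\big[\,|Y - \mathrm{tr}(Q\rho_i)| \ge \varepsilon\,\big] \le 2e^{-r\varepsilon^2/16}.$$

So the verifier rejects with probability $\le 2e^{-r\varepsilon^2/16} = (1/4)(\varepsilon/(4^k m))$.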
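The choice of $r$ can be checked with a short calculation. The sketch below assumes the verifier's parameters $\varepsilon = (1/2)(\beta/4^k)$ and $r = (16/\varepsilon^2)\ln(8 \cdot 4^k m/\varepsilon)$, and a tail bound of the form $2e^{-r\varepsilon^2/16}$; the function names are illustrative. In an actual verifier $r$ would be rounded up to an integer, which only decreases the bound.

```python
import math

def verifier_parameters(beta, k, m):
    """Return (eps, r) as set by the verifier; r is left unrounded here."""
    eps = 0.5 * beta / 4**k
    r = (16 / eps**2) * math.log(8 * 4**k * m / eps)
    return eps, r

def yes_rejection_bound(eps, r):
    """Two-sided tail bound 2 * exp(-r * eps^2 / 16) on rejecting a YES instance."""
    return 2 * math.exp(-r * eps**2 / 16)

eps, r = verifier_parameters(beta=0.1, k=2, m=5)
bound = yes_rejection_bound(eps, r)
target = 0.25 * eps / (4**2 * 5)  # (1/4) * eps / (4^k m)
```

Substituting $r$ gives $e^{-r\varepsilon^2/16} = \varepsilon/(8 \cdot 4^k m)$, so `bound` and `target` agree up to floating-point error.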

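Now suppose we have a "NO" instance of Consistency, i.e., for all $n$-qubit states $\sigma$,
there exists some $i$ such that $\|\mathrm{tr}_{\{1,\ldots,n\}-C_i}(\sigma) - \rho_i\|_1 \ge \beta$. We claim that, for any witness
state $\tau$, the verifier rejects with inverse-polynomial probability.

Let $\tau^{(j)}$ denote the reduced state for the $j$'th register, and define $\tau^* = (1/r)\sum_{j=1}^r \tau^{(j)}$.
The significance of this state comes from the following two observations:

$$E(X_j) = \mathrm{tr}((Q \otimes I)\tau^{(j)}),$$

$$E(Y) = (1/r)\sum_{j=1}^r E(X_j) = \mathrm{tr}((Q \otimes I)\tau^*).$$

We know there exists some $i$ such that $\|\mathrm{tr}_{\{1,\ldots,n\}-C_i}(\tau^*) - \rho_i\|_1 \ge \beta$. We can write

$$\mathrm{tr}_{\{1,\ldots,n\}-C_i}(\tau^*) - \rho_i = \frac{1}{2^{|C_i|}} \sum_{Q \in \mathcal{P}^{\otimes |C_i|}} \big[\mathrm{tr}((Q \otimes I)\tau^*) - \mathrm{tr}(Q\rho_i)\big]\, Q.$$

By the triangle inequality,

$$\|\mathrm{tr}_{\{1,\ldots,n\}-C_i}(\tau^*) - \rho_i\|_1 \le \sum_{Q \in \mathcal{P}^{\otimes |C_i|}} \big|\mathrm{tr}((Q \otimes I)\tau^*) - \mathrm{tr}(Q\rho_i)\big|,$$

hence there exists some $Q$ such that $|\mathrm{tr}((Q \otimes I)\tau^*) - \mathrm{tr}(Q\rho_i)| \ge \beta/4^{|C_i|}$.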

So, with probability $\ge 1/(4^k m)$, we will choose some $i$ and $Q$ such that
$|E(Y) - \mathrm{tr}(Q\rho_i)| \ge \beta/4^k = 2\varepsilon$. We now use a Markov argument to lower-bound the probability
that the verifier rejects. First, consider the case where $E(Y) \le \mathrm{tr}(Q\rho_i) - 2\varepsilon$. The verifier
will accept only if $Y \ge E(Y) + \varepsilon$. Define $Z = Y + 1 \ge 0$. By Markov's inequality,

$$\Pr[Z \ge E(Z) + \varepsilon] \le \frac{E(Z)}{E(Z) + \varepsilon} = 1 - \frac{\varepsilon}{E(Z) + \varepsilon} \le 1 - \varepsilon/2,$$

where the last step uses $E(Z) \le 2 - 2\varepsilon$, which holds because $\mathrm{tr}(Q\rho_i) \le 1$.
Hence, the verifier rejects with probability $\ge (1/2)(\varepsilon/(4^k m))$.

Now consider the case where $E(Y) \ge \mathrm{tr}(Q\rho_i) + 2\varepsilon$. The verifier will accept only if
$Y \le E(Y) - \varepsilon$. Define $Z = -Y + 1 \ge 0$. By Markov's inequality,

$$\Pr[Z \ge E(Z) + \varepsilon] \le \frac{E(Z)}{E(Z) + \varepsilon} = 1 - \frac{\varepsilon}{E(Z) + \varepsilon} \le 1 - \varepsilon/2.$$

Hence, the verifier rejects with probability $\ge (1/2)(\varepsilon/(4^k m))$.

The gap between the probability that the verifier rejects on a "NO" instance and
the probability that the verifier rejects on a "YES" instance is $\ge (1/4)(\varepsilon/(4^k m))$. This
gap is inverse polynomial in the size of the input, and it can be amplified via parallel
repetition. □



2.3 Convex Optimization using a Membership Oracle 

Convex optimization is the problem of minimizing a convex function $f$ subject to convex
constraints, i.e., let $K$ be the set of feasible solutions (which is convex), and find
some $x \in K$ that minimizes $f(x)$. Convex optimization includes linear programming
and semidefinite programming as special cases, and has numerous applications in oper- 
ations research, statistics and other areas [TTJ. Many algorithms are known for convex 
optimization. On one hand there are general methods such as the ellipsoid algorithm, 
which solve general convex programs and are theoretically (if not practically) efficient. 
There are also interior-point methods, which typically work on special classes of convex 
programs (e.g., linear or semidefinite programs), and are efficient in practice. 
We will be concerned with convex programs of the following form:

Let $K \subseteq \mathbb{R}^n$ be a convex set specified by a membership oracle, i.e., given a
point $x$, the oracle tells us whether or not $x$ is in $K$.

Assume that $K$ contains a ball of radius $r$ around a known point $p$, and $K$
is contained within a ball of radius $R$ around the origin.

Let $f : \mathbb{R}^n \to \mathbb{R}$ be a linear function, which is efficiently computable.

Find some $x \in K$ that minimizes $f(x)$.

These programs are quite challenging to solve, because we do not have an explicit de- 
scription of the convex constraints; we only have an oracle that tells us whether or not 
a proposed solution is feasible. Moreover, when the solution is not feasible, the oracle 
does not give us any additional information (such as a violated constraint or a separating 
hyperplane) that could help us fix the solution. (However, we at least have a starting 
point p which is feasible.) 

Remarkably, there are algorithms that solve these convex programs in polynomial
time. The first such algorithm was the shallow-cut ellipsoid method, due to Yudin and
Nemirovskii [88, 38]; recently a different algorithm based on random walks in convex
bodies was devised by Bertsimas and Vempala [17, 49]. These algorithms even give "exact"
solutions, in the following sense: if the membership oracle can resolve the boundary
of the set $K$ with precision $\pm\delta$ (for any $\delta$) while taking time $\mathrm{poly}(n, \log(1/\delta))$, then the
algorithm can find the optimal solution with precision $\pm\varepsilon$ (for any $\varepsilon$) while taking time
$\mathrm{poly}(n, \log(R/r), \log(1/\varepsilon))$.

Our problem is a little different, however. We are given a weaker membership oracle,
that runs in time $\mathrm{poly}(n, (1/\delta))$. But our goal is also more modest: we desire an algorithm
that finds the optimal solution in time $\mathrm{poly}(n, (R/r), (1/\varepsilon))$. We refer to this as
"approximate" convex optimization. (Intuitively, in the "approximate" setting, we are
promised that $\delta$ and $\varepsilon$ are at least $1/\mathrm{poly}(n)$, and $R/r$ is at most $\mathrm{poly}(n)$. (For more
discussion of what it means to solve a gap promise problem in polynomial time, see chapter 1.)
Note the contrast with the "exact" setting, where $\delta$ and $\varepsilon$ may be exponentially
small, and $R/r$ may be exponentially large.)

In addition, we care about some other aspects of the algorithm. We will eventually
use this to give a reduction from Local Hamiltonian to Consistency; hence the running
time is less important (so long as it is polynomial), but we are interested in the
relationship between $\delta$ and $\varepsilon$, i.e., for a given value of $\varepsilon$, how small does $\delta$ have to be.

It turns out that the "exact" algorithms mentioned earlier can be adapted to the
"approximate" setting. But in fact there are much simpler algorithms in the "approximate"
setting, for which the relationship between $\delta$ and $\varepsilon$ is just as good, though the
running time is larger. In this section we will describe one such algorithm in detail, and
then sketch some of the other more sophisticated methods.

Now we will define the problem more precisely. We take a similar approach to [38],
though there are some differences which we will discuss presently. First, some notation:
let $S(p, r)$ denote the closed ball of radius $r$ around the point $p$,

$$S(p, r) = \{x \in \mathbb{R}^n \mid \|x - p\| \le r\}.$$

Also, for any set $K$, we define the ball of radius $\varepsilon$ around $K$,

$$S(K, \varepsilon) = \{x \in \mathbb{R}^n \mid \text{there exists } y \in K \text{ s.t. } \|x - y\| \le \varepsilon\},$$

and we define the interior of $K$ with radius $\varepsilon$,

$$S(K, -\varepsilon) = \{x \in \mathbb{R}^n \mid S(x, \varepsilon) \subseteq K\}.$$

Let $K$ be a closed convex set in $\mathbb{R}^n$, and suppose we are given a point $p \in \mathbb{R}^n$, and
inner and outer radii $r, R \in \mathbb{R}$, such that $S(p, r) \subseteq K \subseteq S(0, R)$. (This implies that $K$
is bounded and full-dimensional.) We want to show a reduction from the problem of
optimizing a linear function over $K$, to the problem of deciding membership in $K$.

In the following sections, we will represent real numbers with F bits of precision; 
arithmetic operations will then take time poly(F). (Usually, we will have F = poly(n).) 

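We define the weak optimization problem WOPT$_\varepsilon$ as follows: (The adjective "weak"
refers to the fact that we allow additive errors of size $\varepsilon$.)

Given $c \in \mathbb{R}^n$, $\|c\| = 1$, $\gamma \in \mathbb{R}$, and $\varepsilon \in \mathbb{R}$, $\varepsilon > 0$.

If there exists a vector $y \in S(K, -\varepsilon)$ with $c \cdot y \ge \gamma + \varepsilon$, then answer "YES."

If for all $x \in S(K, \varepsilon)$, $c \cdot x \le \gamma - \varepsilon$, then answer "NO."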

We have formulated this as a decision problem, rather than a search problem, because 
this suffices for our application. (This is different from the convention used in [38], where
WOPT refers to the search problem, and WVAL is the decision problem. However, the 
same reductions hold true for both WVAL and WOPT.) 

We define the weak membership problem WMEM$_\delta$ as follows:

Given $y \in \mathbb{R}^n$, and $\delta \in \mathbb{R}$, $\delta > 0$.

If $y \in S(K, -\delta)$, then answer "YES."

If $y \notin S(K, \delta)$, then answer "NO."

We also define the weak separation problem WSEP$_\delta$ as follows:

Given $y \in \mathbb{R}^n$, and $\delta \in \mathbb{R}$, $\delta > 0$.

If $y \in S(K, -\delta)$, then answer "YES."

If $y \notin S(K, \delta)$, then return a vector $c \in \mathbb{R}^n$, $\|c\| = 1$, such that for every
$x \in S(K, -\delta)$, $c \cdot x \le c \cdot y + \delta$.

This problem is similar to the membership problem, except that when y lies outside of 
K, one is asked to find a hyperplane that separates y from K. This problem serves as 
an intermediate step in the reduction from WOPT to WMEM. 
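To make the promise structure of weak membership concrete, here is a minimal sketch for the special case where $K$ is an axis-aligned box $[lo, hi]^n$ (an assumption made purely for illustration; the function names are not from the text). It reports which answer the definition forces at a point $y$, or "EITHER" when $y$ falls in the boundary layer where any answer is acceptable.

```python
import math

def dist_to_box(y, lo, hi):
    """Euclidean distance from y to the box K = [lo, hi]^n."""
    return math.sqrt(sum((yi - min(max(yi, lo), hi)) ** 2 for yi in y))

def in_shrunk_box(y, lo, hi, delta):
    """True iff y is in S(K, -delta), i.e. the ball S(y, delta) lies inside K."""
    return all(lo + delta <= yi <= hi - delta for yi in y)

def wmem_promise(y, lo, hi, delta):
    """Answer required by the weak membership problem at y, if any."""
    if in_shrunk_box(y, lo, hi, delta):
        return "YES"
    if dist_to_box(y, lo, hi) > delta:  # y is outside S(K, delta)
        return "NO"
    return "EITHER"                     # boundary layer: no promise applies

answers = [
    wmem_promise([0.5, 0.5], 0.0, 1.0, 0.1),
    wmem_promise([1.5, 0.5], 0.0, 1.0, 0.1),
    wmem_promise([0.95, 0.5], 0.0, 1.0, 0.1),
]
```

An exact membership test is always a valid weak membership oracle; the point of the weak version is that an oracle with errors confined to the boundary layer is still acceptable.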

Finally, we define special versions of these problems that capture the notion of "approximate"
convex optimization. We define WOPT$_{1/\mathrm{poly}}$ in the same way as WOPT$_\varepsilon$,
except that the input now includes a unary string "$1^s$," such that $\varepsilon \ge 1/s$. Intuitively,
this amounts to a promise that $\varepsilon$ is at least inverse-polynomial in the length of the input.
In a similar way, we define WMEM$_{1/\mathrm{poly}}$ and WSEP$_{1/\mathrm{poly}}$. Also, when we deal with
these problems, we will often assume that $R/r \le \mathrm{poly}(n)$.

There are a few differences between our definitions and the ones in [38]. We construct
gap promise problems, where the input is promised to fall under one of two (disjoint)
cases, and the algorithm must answer "YES" or "NO" accordingly. [38] uses a different
style, where the algorithm must assert either "A is true" or "B is true"; on every input,
at least one of them is true, however it is also possible for both A and B to hold
simultaneously. In fact this formulation is equivalent to a promise problem, where "A
and not B" and "B and not A" are the two disjoint cases, which the algorithm must
distinguish.

Also, unlike here, [38] does not make any assumptions about how many bits of 
precision are used to specify the input; they show that the running time is polynomial 
in the length of the input, which is not necessarily polynomial in n. Our setting, where 
the input has poly(n) bits of precision and the running time is poly(n), can be viewed 
as a special case. 

Our main result is the following: 

Theorem 2.3 Let $K$ be any closed convex set in $\mathbb{R}^n$, such that $S(p, r) \subseteq K \subseteq S(0, R)$,
as defined above. Suppose $R/r \le \mathrm{poly}(n)$. Then there is a poly-time oracle reduction
from WOPT$_{1/\mathrm{poly}}$ to WMEM$_{1/\mathrm{poly}}$.

We will prove this theorem in the following sections. (We will also give more detailed 
bounds on the various parameters.) The techniques used for "exact" convex optimization 
[38] can be adapted to our "approximate" setting. However, one can give other, simpler
reductions in the "approximate" case — in particular, one can do away with the ellipsoid 
method entirely. The general picture is as follows: 

In the "exact" setting [38], one can reduce WOPT to WSEP using the central-cut
ellipsoid method. It is not known whether one can reduce WSEP to WMEM, but one
can reduce WOPT to WMEM via the shallow-cut ellipsoid method.

In the "approximate" setting, one can give similar reductions. This is because the
above algorithms have the property that, when $R/r$ is at most polynomial, $\varepsilon$ and $\delta$
are polynomially related. Alternatively, one can reduce WOPT$_{1/\mathrm{poly}}$ to WSEP$_{1/\mathrm{poly}}$
using a simple perceptron-like algorithm. Furthermore, one can reduce WSEP$_{1/\mathrm{poly}}$ to
WMEM$_{1/\mathrm{poly}}$, using a clever non-ellipsoidal algorithm (this was actually a preprocessing
step in the shallow-cut ellipsoid method). Combining these steps gives a simpler
reduction from WOPT$_{1/\mathrm{poly}}$ to WMEM$_{1/\mathrm{poly}}$, for which the relationship between $\varepsilon$ and
$\delta$ is just as good, but the running time is larger.

Finally, there are the algorithms based on random walks [17]. These are notable
for a couple of reasons. First, they can solve convex programs where the objective
function $f$ is not linear. Roughly speaking, one needs a membership oracle for the set
$K$, and a separation oracle for the level sets of $f$ (which one could obtain by computing
the gradient of $f$). We will not need this extra degree of generality here.

Second, these algorithms have a simple error-tolerance property, which is quite dif- 
ferent from the ellipsoid method. The intuition is as follows. These algorithms work by 
performing a random walk inside the set K, which converges to the uniform distribu- 
tion. The membership oracle makes mistakes near the boundary of K. If this "boundary 
layer" is sufficiently thin, then its volume will be small compared to the total volume of 
K, and so with significant probability, the random walk will never visit that part of the 
set. 

These random-walk algorithms might in some cases achieve a better relationship 
between $\varepsilon$ and $\delta$, compared to the shallow-cut ellipsoid method. It would be interesting
to carry out this analysis in detail. 

2.3.1 A Simple Reduction 

We will give a simple reduction from WOPT to WMEM in the approximate setting. 

Note: All calculations are done with poly(n) bits of precision. However, in order 
to give a more streamlined exposition, in this section we assume that all arithmetic 
operations yield exact results. Later, in section 2.3.2, we will analyze the effect of
round-off errors. 

We present the reduction in several steps. First, consider a variant of the weak
membership problem with 1-sided error (call it WMEM$^1_\delta$):

Given $y \in \mathbb{R}^n$, and $\delta \in \mathbb{R}$, $\delta > 0$, all specified with $\mathrm{poly}(n)$ bits of precision.
Distinguish between the following two cases:

If $y \in K$, then answer "YES."

If $y \notin S(K, \delta)$, then answer "NO."

Lemma 2.4 (This is Lemma 4.3.3 in [38].) There exists an algorithm $A$ and a polynomial
$t$, such that for any convex set $K$ with parameters $(n, R, r, p)$ as defined above,
and for any $\delta > 0$, there exists $\delta' \ge r\delta/4R$, such that $A((n, R, r, p), \ldots)$ is an oracle
reduction from WMEM$^1_\delta$ to WMEM$_{\delta'}$, which runs in time $t(n, \log(R))$.

Proof: The algorithm $A$ is as follows:

Given $(n, R, r, p)$ as defined above, $y \in \mathbb{R}^n$, $\delta > 0$.

If $\|y - p\| \ge 2R$, then answer "NO."

Run the WMEM$_{\delta'}$ oracle on the point $y' = (1 - \delta/4R)y + (\delta/4R)p$,
and return the answer given by the oracle.

The analysis is straightforward; see [38] for details. □
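The algorithm above is short enough to sketch directly. The code below is an illustration under stated assumptions: `wmem_oracle(point, delta_prime)` is a hypothetical callable standing in for the WMEM oracle, $\delta' = r\delta/4R$ is used as the oracle's precision, and the two example oracles are for the unit ball with $p = 0$ and $r = R = 1$. The point $y$ is pulled slightly toward the interior point $p$ before querying, which is what makes the error one-sided.

```python
import math

def norm(v):
    return math.sqrt(sum(t * t for t in v))

def wmem_one_sided(y, delta, p, r, R, wmem_oracle):
    """One-sided weak membership for K, using a two-sided weak membership oracle."""
    if norm([yi - pi for yi, pi in zip(y, p)]) >= 2 * R:
        return "NO"
    t = delta / (4 * R)
    y_prime = [(1 - t) * yi + t * pi for yi, pi in zip(y, p)]  # shrink toward p
    return wmem_oracle(y_prime, r * delta / (4 * R))

def strict_ball_oracle(point, delta_prime):
    """Valid weak oracle for the unit ball that says NO whenever it is allowed to."""
    return "YES" if norm(point) <= 1 - delta_prime else "NO"

def lenient_ball_oracle(point, delta_prime):
    """Valid weak oracle for the unit ball that says YES whenever it is allowed to."""
    return "YES" if norm(point) <= 1 + delta_prime else "NO"

origin = [0.0, 0.0]
inside = wmem_one_sided([0.99, 0.0], 0.1, origin, 1.0, 1.0, strict_ball_oracle)
outside = wmem_one_sided([1.2, 0.0], 0.1, origin, 1.0, 1.0, lenient_ball_oracle)
```

Even the least helpful oracle accepts a point of $K$ after the shift, and even the most permissive oracle rejects a point outside $S(K, \delta)$.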

Next, consider a variant of the weak separation problem with parameter $\beta$ (call this
WSEP$^\beta_\delta$):

Given a point $y \in \mathbb{R}^n$, $0 < \delta < 1$, and $0 < \beta < 1$, specified with $\mathrm{poly}(n)$ bits
of precision.

If $y \in S(K, -\delta)$, answer "YES."

If $y \notin S(K, \delta)$, return a vector $c \in \mathbb{R}^n$, $\|c\| = 1$, such that for every $x \in K$,
$c \cdot x \le c \cdot y + \delta + \beta\|x - y\|$.

Intuitively, we now have a weaker form of separation: instead of a separating hyperplane,
we have a cone with slope $\beta$. Points $x \in K$ that are far away from $y$ can violate the
inequality $c \cdot x \le c \cdot y + \delta$ by an amount proportional to $\|x - y\|$.

Lemma 2.5 (This is Lemma 4.3.4 in [38].) There exists an algorithm $A$ and a polynomial
$t$, such that for any convex set $K$ with parameters $(n, R, r, p)$ as defined above,
and for any $0 < \delta < 1$ and $0 < \beta < 1$, there exists $\varepsilon \ge \beta^2 r^2 \delta/(128 n^5 R^2)$, such that
$A((n, R, r, p), \ldots)$ is an oracle reduction from WSEP$^\beta_\delta$ to WMEM$^1_\varepsilon$, which runs in
time $t(n, (1/\beta), \log(R/r), \log(1/\delta))$.

Proof: The algorithm is as follows:

Given $(n, R, r, p)$ as defined above, $y \in \mathbb{R}^n$, $0 < \delta < 1$, $0 < \beta < 1$.

Run the WMEM$^1_\varepsilon$ oracle on the point $y$. If the oracle answers "YES," then
return "YES."

Define $\delta_1 = \frac{r}{R+r}\delta$, $r_1 = \frac{r}{4nR}\delta_1$, $\varepsilon = \varepsilon_1 = \frac{\beta^2}{16n^4} r_1$, and $\alpha = \arctan(\beta/4n^2)$.

Do binary search to find two points $v$ and $v'$ on the line segment connecting
$y$ and $p$, such that $v$ is closer to $y$, $v'$ is closer to $p$, the WMEM$^1_\varepsilon$ oracle
answers "NO" at $v$ and "YES" at $v'$, and $\|v - v'\| \le \delta_1/(2n)$. Then define
$v'' = \frac{1}{r + \varepsilon_1}\big((r - r_1)v' + (r_1 + \varepsilon_1)p\big)$. Translate the coordinate system so
that $v'' = 0$.

Repeat the following procedure:

Let $H$ be the $(n-1)$-dimensional hyperplane perpendicular to $v$ and
containing the point $(\cos^2\alpha)v$. Let $v_1, \ldots, v_n$ be the vertices of a regular
simplex in $H$, centered at $(\cos^2\alpha)v$, such that for all $i = 1, \ldots, n$, the
angle between $v_i$ and $v$ equals $\alpha$. (Note that $\|v_i\| = (\cos\alpha)\|v\|$.)

Run the WMEM$^1_\varepsilon$ oracle at each of the points $v_1, \ldots, v_n$. If the oracle
returns "NO" on some of the $v_i$, then choose one such $v_i$, set $v := v_i$
(replacing the previous value of $v$), and go back to the beginning of the
loop.

If the oracle returns "YES" on all of the $v_i$, then break out of the loop,
and return the vector $c = v/\|v\|$.

The analysis of this algorithm is rather intricate. We will sketch the general ideas; details
can be found in [38].

First, we run the WMEM$^1_\varepsilon$ oracle on the point $y$. If this is a "YES" instance of the
problem, then we are done. If this is a "NO" instance of the problem, then we proceed
to the remainder of the algorithm; furthermore, we can conclude that $y \notin K$.

Note that the angle $\alpha$ is defined by a right triangle with side lengths $\sqrt{r_1}$ and $\sqrt{\varepsilon_1}$.

The binary search produces two points $v$ and $v'$ such that $\|v - v'\| \le \delta_1/(2n)$, $v \notin K$
and $v' \in S(K, \varepsilon_1)$. We construct a point $v''$ that satisfies $S(v'', r_1) \subseteq K$ (this can be
seen by a duality argument^3), and $\|v - v''\| \le \delta_1/n$. When we translate the coordinates
so that $v'' = 0$, we get that $S(0, r_1) \subseteq K$ and $\|v\| \le \delta_1/n$.

Next we do an iterative procedure that continues until it finds a simplex $v_1, \ldots, v_n \in S(K, \varepsilon_1)$,
where the simplex was constructed from a vector $v \notin K$. Let $p$ denote the
number of iterations; we can upper-bound it as follows. Note that with every iteration,
$\|v\|$ decreases by a factor of $(\cos\alpha)$. Initially, $\|v\| \le \delta_1$, and the loop must terminate as
soon as $\|v\| \le r_1$, since $S(0, r_1) \subseteq K$. Then $p$ must satisfy the inequality $(\cos\alpha)^p \delta_1 \ge r_1$.
This implies

$$p \le \frac{\log(r_1/\delta_1)}{\log(\cos\alpha)} = \frac{\log(\delta_1/r_1)}{\log(1/\cos\alpha)}.$$

Observe that $\log(\delta_1/r_1) = \log(4nR/r)$, and

$$\begin{aligned}
\log(1/\cos\alpha) &= -\tfrac{1}{2}\log(1 - \sin^2\alpha) \\
&\ge \tfrac{1}{2}\sin^2\alpha && [\text{since } \log(1+x) \le x \text{ for all } x] \\
&= \frac{1}{2}\,\frac{\beta^2}{\beta^2 + 16n^4} && [\text{by the definition of } \alpha] \\
&\ge \frac{1}{2}\,\frac{\beta^2}{17n^4}.
\end{aligned}$$

Hence the number of iterations $p$ is at most $\mathrm{poly}(n, \log(R/r), (1/\beta))$.
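The iteration bound can be evaluated numerically. The sketch below assumes $\tan\alpha = \beta/4n^2$ and $\delta_1/r_1 = 4nR/r$, as in the derivation above, and compares the bound $\log(\delta_1/r_1)/\log(1/\cos\alpha)$ with the simpler closed form $34 n^4 \log(4nR/r)/\beta^2$ that follows from $\log(1/\cos\alpha) \ge \beta^2/(34 n^4)$. The function name is illustrative.

```python
import math

def iteration_bound(n, R, r, beta):
    """Return (exact, simple) upper bounds on the number of loop iterations."""
    alpha = math.atan(beta / (4 * n**2))
    log_ratio = math.log(4 * n * R / r)          # log(delta_1 / r_1)
    exact = log_ratio / math.log(1 / math.cos(alpha))
    simple = 34 * n**4 * log_ratio / beta**2     # uses log(1/cos a) >= beta^2/(34 n^4)
    return exact, simple

exact, simple = iteration_bound(n=2, R=4.0, r=1.0, beta=0.5)
```

Both quantities are polynomial in $n$ and $1/\beta$ and logarithmic in $R/r$, which is the form of the running time claimed in the lemma.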

We claim that $c = v/\|v\|$ has the desired property, namely that for all $x \in K$,

$$c \cdot x \le c \cdot y + \delta + \beta\|x - y\|. \qquad (2.1)$$

Consider the following simpler statement, that for all $x \in K$,

$$c \cdot x \le \beta\|x\| + \delta_1. \qquad (2.2)$$

First, we show that (2.2) implies (2.1). Given some $x \in K$, consider the point

$$x' = \frac{r}{R+r}(x - y) = \frac{r}{R+r}\,x + \frac{R}{R+r}\Big(\frac{r}{R}(-y)\Big).$$

We claim that $x' \in K$. Geometrically, the picture is as follows:

[Figure: $x'$ as a convex combination of $x$ and $(r/R)(-y)$.]

^3 Take any $d \in \mathbb{R}^n$ and $\gamma \in \mathbb{R}$ such that $\|d\| = 1$ and all $x \in K$ satisfy $d \cdot x \le \gamma$. Then observe that
$d \cdot v' \le \gamma + \varepsilon_1$ and $d \cdot p \le \gamma - r$. This implies $d \cdot v'' \le \gamma - r_1$.

The vector $x'$ is proportional to $x - y$, and is a convex combination of $x$ and $(r/R)(-y)$.
$(r/R)(-y)$ lies along the line $py$. In our picture, $y$ is on the right of $v'' = 0$, and $p$ is on
the left. So $(r/R)(-y)$ is on the left of $v'' = 0$. Also, without loss of generality, $\|y\| \le R$,
so $(r/R)(-y)$ is on the right of $p - r(y/\|y\|)$. Hence, by convexity, $(r/R)(-y) \in K$, and
this implies $x' \in K$.

Now substitute $x'$ into (2.2): this yields (2.1), as desired.

Next, we will show that (2.2) holds, i.e., that for all $x \in K$,

$$c \cdot x \le \beta\|x\| + \delta_1.$$

Define $v'_i = \frac{r_1}{r_1 + \varepsilon_1} v_i$. Observe that $v'_1, \ldots, v'_n \in K$ (this follows from the fact that
$S(0, r_1) \subseteq K$ and a duality argument). Also note that $\frac{r_1}{r_1 + \varepsilon_1} = \cos^2\alpha$. Define
$w = (1/n)\sum_{i=1}^n v'_i$; note that $w = \gamma v$, where we define $\gamma = \cos^4\alpha$.

We write $x$ in the form $x = \lambda v + u$, where $u \cdot v = 0$. Notice that $c \cdot x = \frac{v}{\|v\|} \cdot x = \lambda\|v\|$;
also recall that $\|v\| \le \delta_1/n$. If $\lambda \le 1$, then the claim follows easily. However, if $\lambda > 1$,
we need a more clever argument.

If $\lambda > 1$, then the geometric picture is as follows:

[Figure: the line through $x$ and $v$ meets the hyperplane $w + v^\perp$ at the point $z$.]

We draw a line through $x$ and $v$. This line intersects the hyperplane $w + v^\perp$ at some
point; call this point $z$. We will make the following argument. Since $x \in K$ and $v \notin K$,
we know that $z \notin K$. Thus, within the hyperplane $w + v^\perp$, $z$ cannot lie within the
simplex generated by $v'_1, \ldots, v'_n$. Thus $z$ must be far from $w$, so $x$ must be far from $\lambda v$,
which implies that $\|x\|$ is large and $\lambda$ is relatively small.

This can be made precise as follows (see [38] for the step-by-step details). We can
rewrite $x = \lambda v + u$ in the form

$$\gamma v + \frac{\gamma - 1}{\lambda - 1}\,u = \frac{\gamma - 1}{\lambda - 1}\,x + \frac{\lambda - \gamma}{\lambda - 1}\,v.$$

Then we set $z$ equal to either side of this equation. Using the above geometric argument,
we deduce a lower bound on $\|z - w\|$,

$$\|z - w\| \ge (1/n)\|v'_1 - w\| = (1/n)(\tan\alpha)\|w\|.$$

From the definition of $z$, we have that $u = \frac{\lambda - 1}{\gamma - 1}(z - \gamma v)$. Also recall that $w = \gamma v$. Hence

$$\|u\| \ge \frac{\lambda - 1}{|\gamma - 1|}\,(1/n)(\tan\alpha)\,\gamma\|v\|.$$

After some manipulation, this yields the bound

$$\|u\| \ge (\lambda - 1)\|v\|\,\frac{n}{\beta},$$

which implies

$$\lambda - 1 \le \frac{\|u\|}{\|v\|}\cdot\frac{\beta}{n} \le \frac{\|x\|}{\|v\|}\cdot\frac{\beta}{n}.$$

Substitute this into $c \cdot x = \frac{v}{\|v\|} \cdot x = \lambda\|v\|$ and the claim follows.

Note: the reader may have noticed that we proved an inequality that is stronger than
(2.2), by a factor of $1/n$ on the right hand side. This has to do with a slight difference
between our algorithm and the one in [38]. Our algorithm outputs $c = v/\|v\|$, whereas
the algorithm in [38] outputs $c = v/\|v\|_\infty$. By not fully normalizing $c$, they avoid some
potential problems with numerical precision; however, this is only a concern when one
is doing "exact" convex optimization. □

In fact, with a slight modification, the above algorithm solves the WSEP problem
in the approximate setting (but not in the exact setting, because the running time is
polynomial in $1/\beta$, not $\log(1/\beta)$). Thus, by combining the previous two lemmas, we can
get a reduction from WSEP to WMEM in the approximate setting.

Lemma 2.6 There exists an algorithm $B$ and a polynomial $t$, such that for any convex
set $K$ with parameters $(n, R, r, p)$ as defined above, and for any $0 < \varepsilon < 1$, there exists
$\delta \ge r^3\varepsilon^3/(16384 n^5 R^5)$, such that $B((n, R, r, p), \ldots)$ is an oracle reduction from WSEP$_\varepsilon$
to WMEM$_\delta$, which runs in time $t(n, \log(1/r), (R/\varepsilon))$.

Proof: Using the previous two lemmas, we can give a reduction from WSEP$^\beta_{\varepsilon/2}$ to
WMEM$_\delta$. Then set $\beta = \varepsilon/(4R)$. Modify the algorithm so that it first checks if $\|y\| > R$,
and if so, returns $c := y/\|y\|$. This algorithm correctly solves the WSEP$_\varepsilon$ problem: If
$y \in S(K, -\varepsilon/2)$, it answers "YES." If $y \notin S(K, \varepsilon/2)$ and $\|y\| > R$, then $c = y/\|y\|$
defines a separating hyperplane (since for all $x \in K$, $\|x\| \le R$). If $y \notin S(K, \varepsilon/2)$ and
$\|y\| \le R$, then we have that for every $x \in K$, $c \cdot x \le c \cdot y + \varepsilon/2 + 2R\beta \le c \cdot y + \varepsilon$. □

Next, one can use a simple perceptron-like algorithm to reduce from WOPT to 
WSEP in the approximate setting. 

Lemma 2.7 There exists an algorithm C and a polynomial t, such that for any convex
set K with parameters (n, R, r, p) as defined above, and for any $\varepsilon > 0$, there exists
$\delta \ge \varepsilon/3$, such that C((n, R, r, p), ...) is an oracle reduction from WOPT$_\varepsilon$ to WSEP$_\delta$,
which runs in time $t(n, R, 1/\varepsilon)$.

Proof: Assume we have an oracle for WSEP$_\delta$; we will specify $\delta$ later in the proof. We
wish to construct an algorithm C that solves WOPT$_\varepsilon$. Let $c$, $\gamma$ and $\varepsilon$ be given.
Define the set

$$K'(c, \gamma) = K \cap \{x \in \mathbb{R}^n \mid c \cdot x \ge \gamma\}.$$

Clearly $K'(c, \gamma)$ has outer radius R. We have to distinguish between the following two
cases: (1) If there exists a vector $y \in S(K, -\varepsilon)$ with $c \cdot y \ge \gamma + \varepsilon$, then $K'(c, \gamma)$ contains
a ball of radius $\varepsilon$. (2) If for all $x \in S(K, \varepsilon)$, $c \cdot x \le \gamma - \varepsilon$, then $K'(c, \gamma)$ is empty.
We can construct a WSEP$_\delta$ oracle for $K'(c, \gamma)$ as follows:

    Given $y \in \mathbb{R}^n$.
    Run the WSEP$_\delta$ oracle for K on input y.
    If the oracle returns a separating hyperplane s, then return s.
    Else, if $c \cdot y < \gamma$, then return $-c$.
    Else, return "YES."

Now we construct the following algorithm C that solves WOPT$_\varepsilon$. (This is essentially
the same as the classical perceptron algorithm.)

    Given $c \in \mathbb{R}^n$, $\gamma \in \mathbb{R}$ and $\varepsilon > 0$.
    Initialize $z = (0, \ldots, 0) \in \mathbb{R}^n$.
    Repeat the following at most $R^2/(\varepsilon - 2\delta)^2$ times:
        Run the WSEP$_\delta$ oracle for $K'(c, \gamma)$ on input z.
        If the oracle returns "YES," then return "YES."
        Else, the oracle returns a separating hyperplane s. Set $z = z - (\varepsilon - 2\delta)s$.
    If the oracle never returned "YES," then return "NO."

Also, we set $\delta = \varepsilon/3$. It is straightforward to see that this algorithm runs in time
poly$(n, R, 1/\varepsilon)$. It remains to show that the algorithm correctly solves the WOPT$_\varepsilon$
problem.

First, consider case (2): for all $x \in S(K, \varepsilon)$, $c \cdot x \le \gamma - \varepsilon$. Then the WSEP$_\delta$ oracle
for $K'(c, \gamma)$ will never answer "YES"; if it did, that would imply $y \in S(K'(c, \gamma), \delta)$, and
thus, $y \in S(K, \delta)$ and $c \cdot y \ge \gamma - \delta$, a contradiction. Therefore, the algorithm returns
"NO."

Now consider case (1): there exists a vector $y \in S(K, -\varepsilon)$ with $c \cdot y \ge \gamma + \varepsilon$. Thus
$K'(c, \gamma)$ contains a ball of radius $\varepsilon$ centered around y. Let $z_t$ denote the value of z after
the t'th iteration of the algorithm. Consider what happens on the (t + 1)'st iteration.
If the WSEP$_\delta$ oracle for $K'(c, \gamma)$ returns "YES," then the algorithm returns "YES," as
desired. Otherwise, the oracle returns a vector s such that for every $x \in S(K'(c, \gamma), -\delta)$,
$s \cdot x \le s \cdot z_t + \delta$. If we consider the case of $x = y + (\varepsilon - \delta)s$, we see that
$s \cdot y + \varepsilon - \delta \le s \cdot z_t + \delta$. In other words,

$$s \cdot (y - z_t) \le -\varepsilon + 2\delta.$$

This implies that $z_{t+1}$ will be closer to y than $z_t$ was. In particular,

$$\|z_{t+1} - y\|^2 = \|z_t - y\|^2 - 2(z_t - y) \cdot (\varepsilon - 2\delta)s + \|(\varepsilon - 2\delta)s\|^2$$
$$\le \|z_t - y\|^2 - 2(\varepsilon - 2\delta)^2 + (\varepsilon - 2\delta)^2$$
$$= \|z_t - y\|^2 - (\varepsilon - 2\delta)^2.$$

We know that our starting point $z_0$ was not too far from y, specifically, $\|z_0 - y\|^2 \le R^2$.
Thus, after at most $R^2/(\varepsilon - 2\delta)^2$ iterations, the algorithm will find the point y and
return "YES."

Thus, the algorithm correctly solves the WOPT$_\varepsilon$ problem. □
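
The perceptron-like reduction in the proof above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names (`make_kprime_oracle`, `perceptron_wopt`, `ball_oracle`) are hypothetical, the separation oracle is modelled as a function that returns `None` for "YES" or a unit vector otherwise, and the toy oracle for a Euclidean ball is exact rather than weak.

```python
import math

def make_kprime_oracle(sep_K, c, gamma):
    """Wrap a separation oracle for K into one for K' = K ∩ {x : c·x >= gamma}."""
    def sep_Kprime(y):
        s = sep_K(y)
        if s is not None:                      # oracle for K returned a hyperplane
            return s
        if sum(ci * yi for ci, yi in zip(c, y)) < gamma:
            return [-ci for ci in c]           # y violates the linear constraint
        return None                            # "YES"
    return sep_Kprime

def perceptron_wopt(sep_K, c, gamma, eps, R, n):
    """Answer "YES"/"NO" using at most R^2/(eps - 2*delta)^2 oracle calls."""
    delta = eps / 3.0
    step = eps - 2.0 * delta
    sep = make_kprime_oracle(sep_K, c, gamma)
    z = [0.0] * n
    for _ in range(math.ceil(R * R / (step * step))):
        s = sep(z)
        if s is None:
            return "YES"
        z = [zi - step * si for zi, si in zip(z, s)]   # z = z - (eps - 2 delta) s
    return "NO"

def ball_oracle(radius):
    """Toy (exact) separation oracle for the origin-centred ball of given radius."""
    def sep(y):
        norm = math.sqrt(sum(yi * yi for yi in y))
        return None if norm <= radius else [yi / norm for yi in y]
    return sep
```

For the unit disc with c = (1, 0), the sketch answers "YES" for a threshold such as 0.5 and "NO" for an unreachable threshold such as 2.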

Combining all of these steps, we get a reduction from WOPT to WMEM. 

Proposition 2.8 There exists an algorithm A and a polynomial t, such that for any
convex set K with parameters (n, R, r, p) as defined above, and for any $0 < \varepsilon < 1$, there
exists $\delta \ge r^3\varepsilon^3/(442368\,n^5R^4)$, such that A((n, R, r, p), ...) is an oracle reduction from
WOPT$_\varepsilon$ to WMEM$_\delta$, which runs in time $t(n, R, 1/\varepsilon, \log(1/r))$.

Proof: This follows from Lemmas 2.7 and 2.6. □

This directly implies Theorem 2.3.

A few remarks about the precision requirement, i.e., the dependence of $\delta$ on $\varepsilon$.
First, the constant factor of 442368 can be substantially improved by doing a more
careful analysis. But it is less clear whether one can improve on the overall form of the
expression $r^3\varepsilon^3/n^5R^4$. Note that this expression comes mostly from the reduction from
WSEP$'_\beta$ to WMEM. This step also appears in the more sophisticated reductions based
on the ellipsoid method; so the precision requirement for those reductions is comparable.
On the other hand, it may be possible to improve on the precision requirement by using
algorithms based on random walks instead; this would give a randomized (rather than
deterministic) reduction.

2.3.2 Round-off Errors 

We now consider the effect of round-off errors in the algorithms described above. We
claim that if we do all calculations with poly(n) bits of precision, then the errors are
negligible. Since we are doing "approximate" convex optimization, rather than "exact,"
our situation is much less delicate than the one in [38].



First, some general remarks: We represent numbers using poly(n) bits of precision.
For simplicity, we use fixed-point notation, where the position of the decimal point is
fixed. This is less powerful than floating-point notation, but it suffices for our needs.
See [57] for a detailed discussion of how to implement the basic arithmetic operations.

If the algorithm returns some number $r'$, and the true answer is $r$, we want to bound
the absolute error, i.e., we want to show that $|r - r'| \le \varepsilon$. (Alternatively, one could
bound the relative error, i.e., $|r - r'| \le \varepsilon|r|$. But this is less useful for our purposes.)

Errors come from various sources. When we round a number to poly(n) bits of
precision, the absolute error increases by $2^{-\mathrm{poly}(n)}$, which is not too serious; the real
concern is that subsequent arithmetic operations can amplify the error.

The absolute error behaves well under addition and subtraction, but can blow up
after multiplication by a very large number or division by a very small number. In
particular, if $|r - r'| \le \varepsilon$ and $|s - s'| \le \delta$, then we have the following bounds:

$$|(r + s) - (r' + s')| \le \varepsilon + \delta,$$
$$|(r - s) - (r' - s')| \le \varepsilon + \delta,$$
$$|rs - r's'| = |r(s - s') + (r - r')s'| \le |r|\delta + \varepsilon|s| + \varepsilon\delta,$$
$$\left|\frac{r}{s} - \frac{r'}{s'}\right| = \frac{|r(s' - s) + (r - r')s|}{|s|\,|s'|} \le \frac{|r|\delta + \varepsilon|s|}{|s|\,|s'|} = \frac{|r/s|\,\delta + \varepsilon}{|s'|}.$$
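
As a quick sanity check of the multiplication bound, the following sketch uses exact rational arithmetic, so no additional rounding enters the comparison. The function names and the sample values are illustrative assumptions, not part of the original analysis.

```python
from fractions import Fraction as F

def product_error_bound(r, s, eps, delta):
    """Bound |r| delta + eps |s| + eps delta on |r s - r' s'|."""
    return abs(r) * delta + eps * abs(s) + eps * delta

def product_error(r, s, r_approx, s_approx):
    """Actual absolute error of the product of the approximations."""
    return abs(r * s - r_approx * s_approx)

# Sample values; the perturbations are chosen in the worst-case directions.
r, s = F(7, 3), F(-5, 2)
eps, delta = F(1, 1000), F(1, 500)
actual = product_error(r, s, r + eps, s - delta)
bound = product_error_bound(r, s, eps, delta)
```

With these particular perturbations the bound is attained exactly, which shows that the three terms cannot be improved in general.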



In addition to the usual arithmetic operations, we will occasionally need to calculate
the square root. This can be done using Newton's method, or just binary search. (Given
a number $r \ge 0$, we want to find some $t \ge 0$ such that $t^2 - r = 0$.) The behavior of the
absolute error depends on the magnitude of r; it can blow up when r is very small.

In particular, suppose $r > 0$, $r' > 0$, $|r - r'| \le \varepsilon$. In the case where $r < r'$, we have
that $\sqrt{r'} \le \sqrt{r} + (r' - r)/(2\sqrt{r})$ (this follows from the concavity of the square root
function, and taking the first derivative at the point r). Thus $\sqrt{r'} - \sqrt{r} \le \varepsilon/(2\sqrt{r})$. A
similar argument applies in the case where $r > r'$. So we have the general bound

$$|\sqrt{r} - \sqrt{r'}| \le \frac{\varepsilon}{2\sqrt{\min(r, r')}}.$$
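
The binary-search square root mentioned above can be sketched with integer fixed-point arithmetic as follows. The representation (a value r stored as the integer `r_fixed = r * 2**frac_bits`) and the function names are assumptions made for this illustration.

```python
import math

def fixed_sqrt(r_fixed, frac_bits):
    """Largest fixed-point t with t*t <= r, found by binary search.

    r_fixed encodes r = r_fixed / 2**frac_bits; the result uses the same scale.
    """
    target = r_fixed << frac_bits            # t_fixed^2 <= target  <=>  t^2 <= r
    lo, hi = 0, max(r_fixed, 1 << frac_bits) + 1
    while hi - lo > 1:                       # invariant: lo^2 <= target < hi^2
        mid = (lo + hi) // 2
        if mid * mid <= target:
            lo = mid
        else:
            hi = mid
    return lo

def sqrt_error_bound(r, r_approx):
    """Right-hand side of the general bound, with eps = |r - r'|."""
    return abs(r - r_approx) / (2.0 * math.sqrt(min(r, r_approx)))
```

The second helper makes the blow-up for small r visible: the same absolute error in r gives a much larger bound when min(r, r') is close to zero.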



Now we consider the algorithms described in the previous section. 

In Lemma 2.4, the reduction from WMEM$'$ to WMEM is quite straightforward.
We are multiplying and dividing numbers whose magnitude is order R, so we need order
log(R) bits of precision.

In Lemma 2.5, the reduction from WSEP$'_\beta$ to WMEM$'$ is much more complicated,
because of the iterative procedure where, on every round, one constructs a simplex
$v_1, \ldots, v_n$ centered around a given vector v. First, let us describe one procedure for
constructing the simplex.

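Take the standard basis vectors $e_1, \ldots, e_n \in \mathbb{R}^n$, where $e_i = (0, \ldots, 0, 1, 0, \ldots, 0)$,
with a 1 in the i'th coordinate. These vectors define a regular simplex
in the (n - 1)-dimensional hyperplane $\{x \in \mathbb{R}^n \mid u \cdot x = 1\}$, where
$u = (1, 1, \ldots, 1)$.

Define $\hat{u} = u/\|u\|$ and $\hat{v} = v/\|v\|$, and apply a rotation Q that maps $\hat{u}$ to $\hat{v}$.

Q is given by the formula $Q = A + I - P$, where A is the desired rotation
within span$(\hat{u}, \hat{v})$, and P is a projector onto span$(\hat{u}, \hat{v})$. We construct A
and P as follows. Define $w = \hat{v} - (\hat{v} \cdot \hat{u})\hat{u}$, and $\hat{w} = w/\|w\|$. Then $\hat{u}$ and $\hat{w}$
form an orthonormal basis for span$(\hat{u}, \hat{v})$, and we can write $\hat{v} = \alpha\hat{u} + \beta\hat{w}$,
or equivalently, $\hat{w} = (1/\beta)(\hat{v} - \alpha\hat{u})$. (Note: in this paragraph only, $\alpha$ and
$\beta$ have a completely different meaning from the $\alpha$ and $\beta$ used elsewhere
in the algorithm.) We define

$$A = \hat{v}\hat{u}^T + (-\beta\hat{u} + \alpha\hat{w})\hat{w}^T$$
$$= \hat{v}\hat{u}^T + (-\beta\hat{u} + (\alpha/\beta)(\hat{v} - \alpha\hat{u}))\hat{w}^T$$
$$= \hat{v}\hat{u}^T + (1/\beta)(-\hat{u} + \alpha\hat{v})\hat{w}^T$$
$$= \hat{v}\hat{u}^T + (1/\beta^2)(-\hat{u} + \alpha\hat{v})(\hat{v} - \alpha\hat{u})^T.$$

And we define

$$P = \hat{u}\hat{u}^T + \hat{w}\hat{w}^T$$
$$= \hat{u}\hat{u}^T + (1/\beta^2)(\hat{v} - \alpha\hat{u})(\hat{v} - \alpha\hat{u})^T.$$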

Finally, we scale the simplex so it has the correct shape. Currently, the center
of the simplex lies at distance $1/\sqrt{n}$ from the origin, and the vertices are
at distance $\sqrt{1 - (1/n)}$ from the center. We want these distances to
be $(\cos^2\alpha)\|v\|$ and $(\sin\alpha\cos\alpha)\|v\|$, respectively. To accomplish this, we
apply the transformation

$$T = \sqrt{n}\,(\cos^2\alpha)\|v\|\,\hat{v}\hat{v}^T + (1 - (1/n))^{-1/2}(\sin\alpha\cos\alpha)\|v\|\,(I - \hat{v}\hat{v}^T),$$

where $\hat{v} = v/\|v\|$, and $\sin\alpha$ and $\cos\alpha$ are obtained from the formulas

$$\sin\alpha = \frac{1}{\sqrt{1 + 16n^4/\beta^2}}, \qquad \cos\alpha = \frac{1}{\sqrt{1 + \beta^2/16n^4}}.$$
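
To make the rotation step concrete, here is a small pure-Python sketch that builds Q = A + I - P from two unit vectors, following the formulas for A and P given above. The helper names are hypothetical, and the sketch assumes the two vectors are not parallel (so that the coefficient beta is nonzero).

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(a):
    norm = math.sqrt(dot(a, a))
    return [x / norm for x in a]

def rotation_matrix(u_hat, v_hat):
    """Q = A + I - P: rotates u_hat to v_hat inside span(u_hat, v_hat),
    and acts as the identity on the orthogonal complement."""
    n = len(u_hat)
    alpha = dot(v_hat, u_hat)
    w = [vi - alpha * ui for vi, ui in zip(v_hat, u_hat)]
    beta = math.sqrt(dot(w, w))              # assumed nonzero
    w_hat = [wi / beta for wi in w]
    # Image of w_hat under the planar rotation: -beta u_hat + alpha w_hat.
    rot_w = [-beta * ui + alpha * wi for ui, wi in zip(u_hat, w_hat)]
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            a_ij = v_hat[i] * u_hat[j] + rot_w[i] * w_hat[j]
            p_ij = u_hat[i] * u_hat[j] + w_hat[i] * w_hat[j]
            Q[i][j] = a_ij + (1.0 if i == j else 0.0) - p_ij
    return Q
```

When the two vectors are nearly parallel, beta is tiny and the division above is ill-conditioned, which is the numerical difficulty the surrounding discussion addresses.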

There are a few places where trouble could occur. First, if the vector v is small, then
$\|v\|$ may have a large error. However, we know that the algorithm must stop iterating
when $\|v\| < r_1$, so v cannot be too small.

The second difficulty occurs when we construct the rotation Q. If $\hat{u}$ and $\hat{v}$ are close
together, then the vector w will be small, so $\hat{w}$ may have a large error; furthermore, $\beta$
will be small, so expressions containing a $(1/\beta)$ factor may have a large error. However,
in this case one can avoid the problem by simply skipping this step, and not applying
any rotation Q. Note that the vertices of the simplex, $e_1, \ldots, e_n$, are at distance 1 from
the origin. If $\|\hat{u} - \hat{v}\| \le \varepsilon$, then the correct rotation would move each point by a
distance of at most $\varepsilon$; so omitting the rotation only increases the error by $\varepsilon$.

Finally, there is the question of how these errors in constructing the simplex affect
the correctness and running time of the iterative procedure in Lemma 2.5. To maintain
correctness, we must ensure that, even with the errors, each vertex $v_i$ satisfies the
following two properties: $v_i \cdot \hat{v} \ge (\cos^2\alpha)\|v\|$, and the angle between $v_i$ and $v$ is at least $\alpha$. We
can accomplish this by slightly adjusting each point $v_i$ in such a way that the simplex
moves away from the origin and expands outward. (One can imagine many ways to do
this adjustment; the details are not important.)

This, of course, hurts the running time: because of the adjustments, the vectors
$v_i$ may not shrink as quickly, so the algorithm may need to perform more iterations.
However, if we use polynomially many bits of precision, then the adjustments will be
sufficiently small, so that the vectors $v_i$ will shrink quickly and the algorithm will need
at most polynomially many iterations. In particular, if an adjustment moves a point $v_i$
by a distance at most $\eta$, then we have that

$$\|v_i\| \le (\cos\alpha)\|v\| + \eta \le (\cos\alpha + \eta/r_1)\|v\|.$$

We can bound the number of iterations p, using the same argument as before: 

log((5i/ri) 
^ log(l/(cosa + ^))' 

and one can show that 

Ml/(cosa + iL))>l(_^_|). 

Using polynomially many bits of precision, we can easily ensure that $\eta/r_1 \le (1/6)(\beta^2/17n^4)$;
then $\log(1/(\cos\alpha + \eta/r_1)) \ge (1/4)(\beta^2/17n^4)$, which means that the algorithm
will need at most polynomially many iterations.

Finally, consider the reduction from WOPT to WSEP in Lemma 2.7. This algorithm
also involves an iterative procedure, but it is quite straightforward; note that after each
iteration, the vector z is updated by an addition operation, so the errors accumulate
gradually without blowing up.

2.3.3 The Ellipsoid Method 

In place of the perceptron algorithm, one can use the standard (central-cut) ellipsoid
method [38] to reduce WOPT to WSEP. This gives a faster running time which is
logarithmic in R/r and $1/\varepsilon$, while the precision requirement is comparable to what we
had before.

Proposition 2.9 There exists an algorithm A, and there exist polynomials q and t,
such that for any convex set K with parameters (n, R, r, p) as defined above, and for
any $\varepsilon > 0$, there exists $\delta \ge 1/q(n, R/r, 1/\varepsilon)$, such that A((n, R, r, p), ...) is an oracle
reduction from WOPT$_\varepsilon$ to WSEP$_\delta$, which runs in time $t(n, \log(R/r), \log(1/\varepsilon))$.

The analysis of this algorithm is similar to [38]; the main difference is that, since
we are doing "approximate" convex optimization, we need to pay more attention to the
precision required for the WSEP oracle. In particular, we need $\delta$ to be polynomial, not
exponential, in $\varepsilon$.

First, some notation. Let E(A, a) denote an ellipsoid,

$$E(A, a) = \{x \in \mathbb{R}^n \mid (x - a)^T A^{-1}(x - a) \le 1\},$$

where A is a positive definite n x n matrix and $a \in \mathbb{R}^n$. Note that $E(A, a) = A^{1/2}B + a$,
where B is the closed ball of radius 1 around the origin. Also, let $\lambda_{\max}(A)$ and $\lambda_{\min}(A)$
denote the largest and smallest eigenvalues of A. Note that $\|A\| = \lambda_{\max}(A)$ and
$\|A^{-1}\| = 1/\lambda_{\min}(A)$.

To solve the WOPT problem, we have to decide whether the set

$$K'(c, \gamma) = K \cap \{x \in \mathbb{R}^n \mid c \cdot x \ge \gamma\}$$

contains a ball of radius $\varepsilon$ or is empty. We have access to a WSEP oracle for $K'(c, \gamma)$. At
every iteration k, the algorithm computes an ellipsoid $E(A_k, a_k)$ that contains $K'(c, \gamma)$.
The required precision $\delta$ for the WSEP oracle scales roughly like $\sqrt{\lambda_{\min}(A_k)}$. The key
observation is that if $\lambda_{\min}(A_k) < \varepsilon^2$, then the ellipsoid $E(A_k, a_k)$ cannot contain $K'(c, \gamma)$
unless $K'(c, \gamma)$ is empty; and if this happens, the algorithm can stop and answer "NO."
Thus, the required precision $\delta$ scales roughly like $\varepsilon$.
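
For orientation, a single step of the textbook central-cut ellipsoid update can be sketched as below. The update formulas are the standard ones for the smallest ellipsoid containing a half-ellipsoid (valid for n >= 2); they are supplied here as an assumption, since the text does not spell them out, and the function name is hypothetical.

```python
import math

def central_cut_update(A, a, c):
    """Smallest ellipsoid E(A', a') containing E(A, a) ∩ {x : c·x <= c·a}.

    A is a positive definite n x n matrix (list of rows), a the center,
    c the normal of the cut returned by a separation oracle; requires n >= 2.
    """
    n = len(a)
    Ac = [sum(A[i][j] * c[j] for j in range(n)) for i in range(n)]
    cAc = sum(c[i] * Ac[i] for i in range(n))
    b = [x / math.sqrt(cAc) for x in Ac]
    a_new = [a[i] - b[i] / (n + 1) for i in range(n)]
    scale = n * n / (n * n - 1.0)
    A_new = [[scale * (A[i][j] - 2.0 / (n + 1) * b[i] * b[j])
              for j in range(n)] for i in range(n)]
    return A_new, a_new
```

In a full algorithm one would call this once per oracle answer and stop with "NO" as soon as the smallest eigenvalue of the current matrix drops below the square of the target radius, as described above.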

A more powerful idea is contained in the shallow-cut ellipsoid method [88], [38]. This
gives a reduction from WOPT to WMEM, with a faster running time which is
logarithmic in R/r and $1/\varepsilon$, and roughly the same precision requirement as before.

Proposition 2.10 There exists an algorithm A, and there exist polynomials q and t,
such that for any convex set K with parameters (n, R, r, p) as defined above, and for
any $\varepsilon > 0$, there exists $\delta \ge 1/q(n, R/r, 1/\varepsilon)$, such that A((n, R, r, p), ...) is an oracle
reduction from WOPT$_\varepsilon$ to WMEM$_\delta$, which runs in time $t(n, \log(R/r), \log(1/\varepsilon))$.

Again, this follows from the analysis of the algorithm in [38]; the only new ingredient
is the claim that $\delta$ is polynomial in $\varepsilon$.

A key idea is the notion of a shallow separation oracle for a convex set K. This
oracle solves the following problem:

    Given a positive definite matrix $A \in \mathbb{R}^{n \times n}$ and a vector $a \in \mathbb{R}^n$.
    Find a vector $c \in \mathbb{R}^n$, $\|c\| = 1$, such that for all $x \in K$,

        $c \cdot x \le c \cdot a + \lambda_c$.

    Or output "NO" if no such vector c exists.

This has a simple geometric interpretation. Consider the ellipsoid E(A, a). Now, given a
vector c, $\|c\| = 1$, find the point where the ray $\{a + \lambda c \mid \lambda \ge 0\}$ intersects the boundary
of the ellipsoid $E((n + 1)^{-2}A, a)$. Then construct a hyperplane orthogonal to c that
contains this point. If K lies entirely behind this hyperplane, then we say that c is a
"shallow cut."

The shallow separation oracle has two important properties: it can be constructed
from a WSEP$'_\beta$ oracle with $\beta = 1/(n + 2)$, and it is powerful enough to support a special
version of the ellipsoid method.

To solve the WOPT problem, we proceed as follows. We construct a shallow
separation oracle for the set $K'(c, \gamma)$. The shallow-cut ellipsoid method works by computing
a series of ellipsoids $E(A_k, a_k)$ that contain $K'(c, \gamma)$. When it queries the shallow
separation oracle on an ellipsoid $E(A_k, a_k)$, the precision required for the WSEP$'_\beta$ oracle
is roughly $\sqrt{\lambda_{\min}(A_k)}$. Now we make the same observation as before: if $\lambda_{\min}(A_k) < \varepsilon^2$,
then $K'(c, \gamma)$ must be empty, and we can stop the algorithm and answer "NO." Thus
the precision required for the WSEP$'_\beta$ oracle is roughly $\varepsilon$.

2.3.4 Algorithms using Random Walks 

As an alternative to the shallow-cut ellipsoid method, one can also use some recently
developed algorithms which are based on random walks in convex bodies [17], [80]. These
algorithms actually solve a slightly more general class of convex programs, where the
objective function f need not be linear; when f is linear, one can use a slightly faster
algorithm based on simulated annealing.

These algorithms are not necessarily faster or more accurate than the shallow-cut
ellipsoid method, but they have other intriguing features. The points where the algorithm
queries the membership oracle are chosen randomly from some set P (which changes
over successive iterations of the algorithm). Thus we get a randomized oracle reduction
from WOPT to WMEM, rather than a deterministic oracle reduction. Also, there is a
simple reason why the randomized algorithm can tolerate imprecision in the membership
oracle: most of the points that it queries will not lie close to the boundary of the set P.
(In contrast, a deterministic algorithm must do some work to correct for possible errors,
as in Lemma 2.6.)

The analysis given by Bertsimas and Vempala [17] assumes a real-valued model of
computation, and does not account for the precision of the membership oracle. However,
this can be done using techniques due to Lovász and Simonovits [63]. Here we sketch
the idea. It would be interesting to prove a tight bound on the precision requirement,
and see how it compares with the precision requirement of the ellipsoid method.

The Bertsimas-Vempala algorithm is built around a subroutine that solves the
feasibility problem (the WOPT problem). The basic idea is as follows:

    Given $c \in \mathbb{R}^n$, $\gamma \in \mathbb{R}$, $\varepsilon > 0$.
    Let P be the set K.
    Randomly sample some points from P, and compute an approximate centroid
    of P; call this point z.
    If $c \cdot z \ge \gamma$, stop and output "YES." Otherwise, use the vector c to cut out
    a portion of the set P.¹
    Repeat the procedure starting from line 3. If P gets too small, stop and
    output "NO."

¹ Specifically, we can deduce a hyperplane that separates z from the set $\{x \mid c \cdot x \ge \gamma\}$. Then we take
the intersection of P with the half-space that does not contain z.



The critical step is to sample random points from the set P. (Note that P is convex,
and we have a membership oracle for P.) One way is to do a random walk known as
the "ball walk":

    Pick a point y uniformly at random in the ball of radius $\delta$ centered at the
    current position x. If $y \in P$, then move to y, otherwise stay at x. Repeat.
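
A minimal sketch of the ball walk, assuming the membership oracle is given as a Python predicate `member(y)` and that a seeded `random.Random` instance is passed in for reproducibility. The names and the sampling routine are illustrative choices, not taken from the cited papers.

```python
import math
import random

def random_point_in_ball(center, radius, rng):
    """Uniform sample from the Euclidean ball of given radius around center."""
    n = len(center)
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]       # random direction
    norm = math.sqrt(sum(x * x for x in g))
    scale = radius * rng.random() ** (1.0 / n) / norm  # radius with density ~ r^(n-1)
    return [c + scale * x for c, x in zip(center, g)]

def ball_walk(member, start, delta, steps, rng):
    """Run the ball walk: propose a point within distance delta, move if it is in P."""
    x = list(start)
    for _ in range(steps):
        y = random_point_in_ball(x, delta, rng)
        if member(y):
            x = y
    return x
```

Because the walk only ever moves to points accepted by the oracle, an imprecise oracle can mislead it only on proposals that land in the thin layer near the boundary of P.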

The points where the membership oracle makes mistakes all lie close to the boundary
of P; call this the "boundary layer" $P_b$. Intuitively, if the boundary layer is thin, it should
not have much effect on the random walk. Using an argument by Lovász and Simonovits
[63], one can prove (omitting some details):

Lemma 2.11 For any polynomial t, there exists a polynomial q such that, if we run the
ball walk for at most t(n) steps, and $\mathrm{vol}(P_b)/\mathrm{vol}(P) \le 1/q(n)$, then with probability 2/3
we will never enter the region $P_b$.

So, if we can show that the boundary layer is small compared to the total volume of 
P, then our algorithm will work fine. (As long as the random walk does not enter 
the boundary layer, the algorithm will perform exactly as if it had access to a perfect 
membership oracle.) 
Define the set

$$K'(c, \gamma) = K \cap \{x \in \mathbb{R}^n \mid c \cdot x \ge \gamma\}.$$

The algorithm has to distinguish between the following two cases: (1) If there exists a
vector $y \in S(K, -\varepsilon)$ with $c \cdot y \ge \gamma + \varepsilon$, then $K'(c, \gamma)$ contains a ball of radius $\varepsilon$. (2) If
for all $x \in S(K, \varepsilon)$, $c \cdot x \le \gamma - \varepsilon$, then $K'(c, \gamma)$ is empty.

In case (1), the set P always contains a ball of radius $\varepsilon$. Let p be the polynomial
such that after at most p(n) steps we will find a solution with the desired precision $\varepsilon$
(assuming a perfect membership oracle). Let q be the polynomial given by Lemma 2.11.
Now set the precision of the membership oracle to be $\delta = \varepsilon/(2nq(n))$. We will show that
the boundary layer $P_b$ is small compared to the total volume of P. Define $P^+$ to be the
set P expanded by an amount $\delta$, that is, $P^+ = P + \delta B$, where B is the unit ball. We
have that

$$P^+ \subseteq P + (\delta/\varepsilon)P = (1 + \delta/\varepsilon)P,$$

where the equality holds because P is convex. Hence

$$\mathrm{vol}(P^+) \le (1 + \delta/\varepsilon)^n\,\mathrm{vol}(P) \le e^{1/(2q(n))}\,\mathrm{vol}(P) \le (1 + 1/q(n))\,\mathrm{vol}(P).$$

So we can conclude that $\mathrm{vol}(P_b) \le \mathrm{vol}(P^+) - \mathrm{vol}(P) \le (1/q(n))\,\mathrm{vol}(P)$. Therefore, by
Lemma 2.11, the algorithm will work correctly in this case.

In case (2), it is easy to see that, as long as the precision of the membership oracle
satisfies $\delta < \varepsilon$, the oracle will never answer "YES," and so the algorithm will output
"NO."
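
The chain of volume inequalities used in case (1) is easy to check numerically. The following sketch evaluates the three quantities for a membership-oracle precision of eps/(2 n q), treating the polynomial value q(n) as a plain number q >= 1; the function name is a hypothetical choice for this illustration.

```python
import math

def volume_chain(n, q):
    """Return ((1 + delta/eps)^n, e^(1/(2q)), 1 + 1/q) for delta = eps / (2 n q)."""
    ratio = 1.0 / (2.0 * n * q)          # delta / eps
    return (1.0 + ratio) ** n, math.exp(1.0 / (2.0 * q)), 1.0 + 1.0 / q
```

For every dimension n and every q >= 1 the three returned values are nondecreasing from left to right, which is exactly the statement that the expanded set has volume at most (1 + 1/q) times that of P.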

2.4 Consistency is QMA-hard



Theorem 2.12 Consistency is QMA-hard, via a poly-time oracle reduction from Local
Hamiltonian. Furthermore, the reduction uses the same value of k for both problems, so
we get that Consistency with k = 2 is QMA-hard. The reduction yields an instance of
Consistency with $\beta \ge \Omega((b - a)^3/(4^{10k}m^{13}))$.

We will prove this theorem in the following sections. First we describe the basic idea of
the reduction, which uses convex optimization with a membership oracle; we also discuss
some of the technical complications that arise. Next, we show how to write our convex
program in a particular form that is needed for the reduction. Finally, we deal with the
issue of numerical precision, and prove the theorem.

2.4.1 The Basic Idea 

We want to solve the Local Hamiltonian problem, i.e., to estimate the smallest eigenvalue
of a local Hamiltonian $H = H_1 + \cdots + H_m$, where $H_i$ acts on the subset $C_i$. To this end,
we consider the following convex program:

    Let $\rho$ be any $2^n \times 2^n$ complex matrix.
    Find some $\rho$ that minimizes $\mathrm{tr}(H\rho)$,
    such that $\rho \succeq 0$ and $\mathrm{tr}(\rho) = 1$.

It is easy to see that H has an eigenvalue $\le \gamma$ if and only if the convex program has
optimal value $\mathrm{tr}(H\rho) \le \gamma$. (Note that, although the convex program allows mixed states
$\rho$, the optimal solution $\rho$ can always be chosen to be a pure state.) Unfortunately, this
convex program has $4^n$ variables, so solving it requires exponential time.

We now construct another convex program, which is equivalent to the previous one,
but has only a polynomial number of variables:

    Let $\rho_1, \ldots, \rho_m$ be complex matrices, where $\rho_i$ has size $2^{|C_i|} \times 2^{|C_i|}$.
    (We interpret each $\rho_i$ as the reduced density matrix for the subset $C_i$.)
    Find some $\rho_1, \ldots, \rho_m$ that minimize $\mathrm{tr}(H_1\rho_1) + \cdots + \mathrm{tr}(H_m\rho_m)$,
    such that each $\rho_i$ satisfies $\rho_i \succeq 0$ and $\mathrm{tr}(\rho_i) = 1$,
    and $\rho_1, \ldots, \rho_m$ are consistent.

Note that consistency implies that $\rho_i \succeq 0$ and $\mathrm{tr}(\rho_i) = 1$, so these constraints are
redundant. One can easily check that the set of feasible solutions is indeed convex: if
$(\rho_1, \ldots, \rho_m)$ are consistent, and $(\rho'_1, \ldots, \rho'_m)$ are consistent, then any convex combination
$(\rho''_1, \ldots, \rho''_m)$, where $\rho''_i = q\rho_i + (1 - q)\rho'_i$ ($0 \le q \le 1$), is also consistent.

The optimal value of this convex program is equal to the optimal value of the previous
convex program; this is because, if $\rho_1, \ldots, \rho_m$ are consistent with some n-qubit state $\sigma$,
then $\mathrm{tr}(H\sigma) = \mathrm{tr}(H_1\rho_1) + \cdots + \mathrm{tr}(H_m\rho_m)$. Also, the number of variables is
$\sum_{i=1}^m 4^{|C_i|} \le 4^k m$, which is polynomial in the length of the input.

This convex program has a "consistency" constraint, which we do not know how to
evaluate. But if we have an oracle for the Consistency problem, then we can solve this
convex program in polynomial time, using the techniques from the previous section. Let
K be the set of feasible solutions,

$$K = \{(\rho_1, \ldots, \rho_m) \text{ which are consistent}\}.$$

Local Hamiltonian is equivalent to the WOPT problem, and Consistency is equivalent
to the WMEM problem. So we can apply Theorem 2.3, which shows a poly-time oracle
reduction from WOPT to WMEM.

We have to deal with a couple of technical issues. First, in order for the reduction to
work, the set K must contain a ball of radius r, and be contained within a ball of radius
R, where R/r is at most polynomially large. In particular, K cannot lie in a
lower-dimensional subspace. This requires us to represent each element $(\rho_1, \ldots, \rho_m) \in K$ in a
way that has the right number of "degrees of freedom."

We could represent $(\rho_1, \ldots, \rho_m)$ by writing down the matrix entries for the $\rho_i$, to
form a vector in $\mathbb{C}^d$, $d = \sum_{i=1}^m 4^{|C_i|}$. But this won't work, because the $\rho_i$ must
satisfy some algebraic constraints, in order to be consistent: each $\rho_i$ must be Hermitian,
$\rho_i^\dagger = \rho_i$, and $\rho_i$ and $\rho_j$ must agree on their intersection $C_i \cap C_j$, that is,
$\mathrm{tr}_{C_i - (C_i \cap C_j)}(\rho_i) = \mathrm{tr}_{C_j - (C_i \cap C_j)}(\rho_j)$. These constraints imply that the set K actually
lies in a lower-dimensional subspace of $\mathbb{C}^d$. In the next section, we will show how to
represent $(\rho_1, \ldots, \rho_m)$ in a way that satisfies these constraints automatically.

The other issue concerns numerical precision. Local Hamiltonian and Consistency 
are equivalent to the WOPT and WMEM problems with inverse-polynomial precision. 
For our reduction, we will bound the amount of precision required of the Consistency 
oracle, in terms of the precision desired for the Local Hamiltonian problem. 

2.4.2 How to represent $(\rho_1, \ldots, \rho_m)$

We will represent each element of K using the expectation values of the "local" Pauli
matrices on the subsets $C_1, \ldots, C_m$. These local Pauli matrices form a basis for the space
of all local Hamiltonians (acting on the subsets $C_i$). For an n-qubit state $\sigma$, knowing
the expectation values of these Pauli matrices is equivalent to knowing the projection of
$\sigma$ onto this subspace; and this is equivalent to knowing the local density matrices of $\sigma$.

First, some notation. Let P be an n-qubit Pauli matrix, $P = \bigotimes_{i=1}^n P_i$. Define the
"support" of P to be the set of qubits on which P acts nontrivially; that is, supp(P) =
$\{i \mid P_i \ne I\}$. Also, for any subset of qubits C, define the "restriction" of P to C,

$$P|_C = \bigotimes_{i \in C} P_i.$$

Define $S_i$ to be the set of Pauli matrices supported on $C_i$, excluding the identity
matrix because its expectation value is always 1:

$$S_i = \{P \in \mathcal{P}^{\otimes n} \mid \mathrm{supp}(P) \subseteq C_i\} - \{I\}.$$

Let $S = \bigcup_{i=1}^m S_i$; this is the set of all "local" Pauli matrices. Let $d = |S|$, and note that
$d \le 4^k m - 1$, which is polynomial in the length of the input.
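
To see the size of the set of local Pauli matrices in a small case, the following sketch enumerates it explicitly. Pauli matrices are represented as tuples over the letters I, X, Y, Z, and qubits are indexed from 0; both conventions, and the function name, are assumptions of this illustration.

```python
from itertools import product

def local_paulis(n, subsets):
    """Set of n-qubit Pauli strings supported on at least one subset C_i,
    excluding the identity string."""
    S = set()
    for C in subsets:
        qubits = sorted(C)
        for letters in product("IXYZ", repeat=len(qubits)):
            p = ["I"] * n
            for qubit, letter in zip(qubits, letters):
                p[qubit] = letter
            if any(letter != "I" for letter in p):
                S.add(tuple(p))              # the set removes duplicates on overlaps
    return S

# Three qubits with C_1 = {0, 1} and C_2 = {1, 2}, so k = 2 and m = 2.
S = local_paulis(3, [{0, 1}, {1, 2}])
d = len(S)
```

Here each subset contributes 15 non-identity Pauli matrices, and the 3 that act only on the shared qubit are counted once, so d = 27, below the bound 4^k m - 1 = 31.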

For each local Pauli matrix $P \in S$, let $\alpha_P$ be the corresponding expectation value;
and let $(\alpha_P)_{P \in S}$ denote the collection of these $\alpha_P$. We define the set $K' \subseteq \mathbb{R}^d$,

$$K' = \{(\alpha_P)_{P \in S} \text{ which are consistent}\},$$

where we say the $\alpha_P$ are "consistent" if there exists an n-qubit state $\sigma$ such that for all
$P \in S$, $\alpha_P = \mathrm{tr}(P\sigma)$. Clearly the set $K'$ is convex.

So we can restate our convex program using the expectation values $\alpha_P$ ($P \in S$),
rather than the density matrices $(\rho_1, \ldots, \rho_m)$:

    Let $\alpha_P$ (for $P \in S$) be real numbers.
    Find some $\alpha_P$ that minimize

        $$\sum_{i=1}^m 2^{-|C_i|}\Big(\mathrm{tr}(H_i) + \sum_{P \in S_i} \alpha_P\,\mathrm{tr}(H_i(P|_{C_i}))\Big),$$

    such that $(\alpha_P)_{P \in S} \in K'$ (i.e., the $\alpha_P$ are consistent).

This is justified by the following two lemmas:

Lemma 2.13 There is a linear bijection between K and K'.

Proof: Given some $(\rho_1, \ldots, \rho_m) \in K$, we can construct $(\alpha_P)_{P \in S} \in K'$ as follows:

    For each $P \in S$: We know that $P \in S_i$ for some i. So we can write P in the
    form $P = (P|_{C_i}) \otimes I$. Then we set $\alpha_P = \mathrm{tr}((P|_{C_i})\rho_i)$.

If the $\rho_i$ are consistent with some n-qubit state $\sigma$, then the $\alpha_P$ are also consistent
with $\sigma$. To see this, write $\alpha_P = \mathrm{tr}((P|_{C_i})\rho_i) = \mathrm{tr}(P\sigma)$. (Note that in the case where
supp(P) $\subseteq C_i \cap C_j$, it makes no difference whether we pick i or j in the above procedure,
because $\rho_i$ and $\rho_j$ yield the same reduced density matrix on $C_i \cap C_j$.)

Going in the opposite direction, given some $(\alpha_P)_{P \in S} \in K'$, we can construct
$(\rho_1, \ldots, \rho_m) \in K$ as follows:

    For each i = 1, ..., m: We construct $\rho_i$ by using the $\alpha_P$ for all $P \in S_i$. Note
    that we can write P in the form $P = (P|_{C_i}) \otimes I$. We set

    $$\rho_i = 2^{-|C_i|}\Big(I + \sum_{P \in S_i} \alpha_P\,(P|_{C_i})\Big).$$

If the $\alpha_P$ are consistent with some n-qubit state $\sigma$, then the $\rho_i$ are also consistent with
$\sigma$. To see this, write $\sigma$ in terms of the $\alpha_P$, where we now include the expectation values
$\alpha_P = \mathrm{tr}(P\sigma)$ for all $P \in \mathcal{P}^{\otimes n}$,

$$\sigma = \frac{1}{2^n} \sum_{P \in \mathcal{P}^{\otimes n}} \alpha_P P.$$

Note that when we trace out the qubits not in $C_i$, we get that $\mathrm{tr}_{\{1,\ldots,n\} - C_i}(P)$ equals
$2^{n - |C_i|}(P|_{C_i})$ if supp(P) $\subseteq C_i$, and 0 otherwise. Thus we have

$$\mathrm{tr}_{\{1,\ldots,n\} - C_i}(\sigma) = \frac{1}{2^{|C_i|}} \sum_{P:\,\mathrm{supp}(P) \subseteq C_i} \alpha_P\,(P|_{C_i}) = \rho_i.$$

Finally, observe that these maps (between K and K') are linear, and they are inverses
of each other. □
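
The two maps in the proof can be checked by hand in the simplest case, where a subset consists of a single qubit. The sketch below is an illustration under that assumption, with hypothetical function names: it rebuilds the reduced density matrix from three expectation values and then recovers them again.

```python
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(A):
    return A[0][0] + A[1][1]

def rho_from_expectations(ax, ay, az):
    """rho = (1/2)(I + ax X + ay Y + az Z) for a single-qubit subset."""
    return [[(I2[i][j] + ax * X[i][j] + ay * Y[i][j] + az * Z[i][j]) / 2
             for j in range(2)] for i in range(2)]

def expectations(rho):
    """(tr(X rho), tr(Y rho), tr(Z rho)), i.e. the map back from rho to alpha."""
    return tuple(trace(matmul(P, rho)).real for P in (X, Y, Z))
```

Composing the two functions returns the original triple, which is the single-qubit instance of the statement that the maps are inverses of each other.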

Lemma 2.14 The optimal value of this convex program is equal to the smallest
eigenvalue of the local Hamiltonian $H = H_1 + \cdots + H_m$.

Proof: This follows from the remarks in the previous section, and Lemma 2.13. In
particular, we have that

$$\sum_{i=1}^m \mathrm{tr}(H_i\rho_i) = \sum_{i=1}^m \mathrm{tr}\Big(H_i\,2^{-|C_i|}\Big(I + \sum_{P \in S_i} \alpha_P\,(P|_{C_i})\Big)\Big).$$

(Note that we view $H_i$ as an operator acting on the subset of qubits $C_i$ only, not the
entire system. So $\mathrm{tr}(H_i)$ is a trace over $2^{|C_i|}$ dimensions.) □

Next, we prove some bounds on the geometry of the set $K' \subseteq \mathbb{R}^d$.

Lemma 2.15 K' is contained in a ball of radius $R = \sqrt{d}$ centered at the origin.

Proof: Suppose $(\alpha_P)_{P \in S} \in K'$, and say it is consistent with some state $\sigma$. Since
$\alpha_P = \mathrm{tr}(P\sigma)$, it follows that $-1 \le \alpha_P \le 1$, which implies the result. □

Lemma 2.16 The ball of radius $r = 1/\sqrt{d}$ around the origin is contained in K'.

Proof: Let $(\alpha_P)_{P \in S}$ be any vector in $\mathbb{R}^d$ of length at most $1/\sqrt{d}$. By the Cauchy-Schwarz
inequality, $\sum_{P \in S} |\alpha_P| \le 1$. Let $p = \sum_{P \in S} |\alpha_P|$. Now define $\sigma = (1/2^n)(I + \sum_{P \in S} \alpha_P P)$.
This is a legal density matrix, because it can be written as

$$\sigma = \frac{1}{2^n}\Big((1 - p)I + \sum_{P \in S}(|\alpha_P| I + \alpha_P P)\Big)
= (1 - p)\frac{I}{2^n} + \sum_{P \in S} |\alpha_P|\,\frac{I + \mathrm{sign}(\alpha_P)P}{2^n},$$

which is (with probability 1 - p) the fully mixed state, and (with probability $|\alpha_P|$, for
$P \in S$) the mixture of all eigenstates of P with eigenvalue sign($\alpha_P$). Furthermore, the
$\alpha_P$ are consistent with $\sigma$; thus we conclude that $(\alpha_P)_{P \in S} \in K'$. □



2.4.3 Numerical Precision 

In this section we deal with the issue of numerical precision. We give reductions from 
Local Hamiltonian to WOPT, from WOPT to WMEM (using the general tools of 
section 2.3), and finally from WMEM to Consistency. 

(Note added later: one can simplify these proofs by using a slightly different reduc- 
tion, from WOPT* to WMEM* , which is described in section 3.6.) 

Lemma 2.17 There is a poly-time mapping reduction from Local Hamiltonian to
WOPT$_{1/\mathrm{poly}}$ (on the set K'). This reduction yields an instance of WOPT with
$\varepsilon \ge \Omega((b - a)/(2^k m^{3/2}))$.

Proof: We have an n-qubit system, and subsets $C_1, \ldots, C_m \subseteq \{1, \ldots, n\}$, where $|C_i| \le k$.
Accordingly we define S to be the set of local Pauli matrices, and let $d = |S|$. We
let $\alpha = (\alpha_P)_{P \in S}$ denote a vector of expectation values of local Pauli matrices. Then
$K' = \{\alpha \in \mathbb{R}^d \mid \alpha \text{ is consistent with some n-qubit state } \sigma\}$. Note that $d \le 4^k m - 1$ is
polynomial in the length of the input to the Local Hamiltonian problem.

We are given a local Hamiltonian $H = \sum_{i=1}^m H_i$, two numbers $a, b \in \mathbb{R}$, and a unary
string $1^s$ such that $b - a \ge 1/s$. (Note that $\|H\| \le \sum_{i=1}^m \|H_i\| \le m$, so we can assume
$|a|, |b| \le m$.) If H has an eigenvalue $\le a$, we should answer "YES"; if all eigenvalues of
H are $\ge b$, we should answer "NO."

We will reduce this to an instance of WOPT$_{1/\mathrm{poly}}$. In this problem, one is given
$c \in \mathbb{R}^d$, $\|c\| = 1$, $\gamma \in \mathbb{R}$, $\varepsilon \in \mathbb{R}$, and a unary string $1^t$, such that $\varepsilon \ge 1/t$. If there
exists some $y \in S(K', -\varepsilon)$ such that $c \cdot y \ge \gamma + \varepsilon$, then we should answer "YES"; if for
all $x \in S(K', \varepsilon)$, $c \cdot x \le \gamma - \varepsilon$, then we should answer "NO."

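As shown in the previous section, the smallest eigenvalue of H is equal to the optimal
value $f(\alpha)$ for the following convex program: find some $\alpha \in K'$ that minimizes the
function

$$f(\alpha) = \sum_{i=1}^m 2^{-|C_i|}\Big(\mathrm{tr}(H_i) + \sum_{P \in S_i} \alpha_P\,\mathrm{tr}(H_i(P|_{C_i}))\Big).$$

We can write $f(\alpha)$ using simpler notation. For each i = 1, ..., m, define a vector
$\eta_i = (\eta_{i,P})_{P \in S}$, where $\eta_{i,P} = 2^{-|C_i|}\,\mathrm{tr}(H_i(P|_{C_i}))$ if P is supported on $C_i$, and $\eta_{i,P} = 0$
otherwise. For each i = 1, ..., m, also define a scalar $\nu_i = 2^{-|C_i|}\,\mathrm{tr}(H_i)$. Then we can
write

$$f(\alpha) = \sum_{i=1}^m (\nu_i + \alpha \cdot \eta_i).$$

Define $\eta = \sum_{i=1}^m \eta_i$ and $\nu = \sum_{i=1}^m \nu_i$. Then we can write

$$f(\alpha) = \nu + \alpha \cdot \eta.$$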

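In addition, we can bound the size of $\eta$ and $\nu$ as follows. Observe that $H_i$ can be
written in terms of $\eta_i$ and $\nu_i$,

\[ H_i = \nu_i I + \sum_{P \in \mathcal{S}_i} \eta_{i,P} (P|_{C_i}). \]

Therefore

\[ \|H_i\|_2^2 = \mathrm{tr}(H_i^2) = 2^{|C_i|} \Big( \nu_i^2 + \sum_{P \in \mathcal{S}_i} \eta_{i,P}^2 \Big) = 2^{|C_i|} (\nu_i^2 + \|\eta_i\|^2). \]

Also, note that $\|H_i\|_2^2 \leq 2^{|C_i|} \|H_i\|^2$. So we conclude that $|\nu_i| \leq \|H_i\| \leq 1$ and
$\|\eta_i\| \leq \|H_i\| \leq 1$. Hence, $|\nu| \leq m$ and $\|\eta\| \leq m$.

Now we construct an instance of $\mathrm{WOPT}_{1/\mathrm{poly}}$ as follows. Let $c = -\eta/\|\eta\|$. We will
specify $\gamma$ and $\epsilon$ later in the proof.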

Consider what happens on a "YES" instance of Local Hamiltonian. There exists
some $\alpha^* \in K'$ such that $\eta \cdot \alpha^* \leq a - \nu$. Furthermore, we claim that there exists a point
$\alpha$ in the interior of $K'$ such that $\eta \cdot \alpha$ is not much larger than $a - \nu$. To see this, let $\sigma^*$
be the n-qubit density matrix corresponding to $\alpha^*$. Now consider the density matrix

\[ (1-q)\sigma^* + q(I/2^n) + \sum_{P \in \mathcal{S}} u_P (P/2^n). \]

This is a legal density matrix (positive semidefinite with trace 1) provided that $0 \leq q \leq 1$
and $\sum_{P \in \mathcal{S}} |u_P| \leq q$. When we write down the expectation values of the local Pauli
matrices $P \in \mathcal{S}$, this density matrix corresponds to the point $(1-q)\alpha^* + u$. This point
is in $K'$ provided that $0 \leq q \leq 1$ and $\|u\|_1 \leq q$. Note that $\|u\|_1 \leq \sqrt{d}\,\|u\|$. We conclude
that a ball of radius $q/\sqrt{d}$ around the point $(1-q)\alpha^*$ is contained in $K'$. In other words,

\[ (1-q)\alpha^* \in S(K', -q/\sqrt{d}). \]

Also, note that

\[ \eta \cdot ((1-q)\alpha^*) \leq (1-q)(a - \nu) \leq a - \nu + 2qm. \]

Now let $q = \epsilon\sqrt{d}$ (assuming $\epsilon \leq 1/\sqrt{d}$). We have shown that there exists some
$\alpha \in S(K', -\epsilon)$ such that $\eta \cdot \alpha \leq a - \nu + 2\epsilon\sqrt{d}\,m$. This implies

\[ -c \cdot \alpha \leq \frac{1}{\|\eta\|} (a - \nu + 2\epsilon\sqrt{d}\,m). \]

We will choose $\gamma$ and $\epsilon$ so that the right side of this inequality equals $-\gamma - \epsilon$. Then this
is a "YES" instance of $\mathrm{WOPT}_{1/\mathrm{poly}}$.

On the other hand, suppose we have a "NO" instance of Local Hamiltonian, so that
for all $\alpha \in K'$, $\eta \cdot \alpha \geq b - \nu$. Furthermore, for all $\alpha$ close to $K'$, $\eta \cdot \alpha$ is not much smaller
than $b - \nu$. In particular, using the fact that $\|\eta\| \leq m$, we get that for any $\alpha \in S(K', \epsilon)$,

\[ \eta \cdot \alpha \geq b - \nu - \epsilon m. \]

This implies

\[ -c \cdot \alpha \geq \frac{1}{\|\eta\|} (b - \nu - \epsilon m). \]

We will choose $\gamma$ and $\epsilon$ so that the right side of this inequality equals $-\gamma + \epsilon$. Then this
is a "NO" instance of $\mathrm{WOPT}_{1/\mathrm{poly}}$.

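Now we choose $\gamma$ and $\epsilon$. We set $\gamma$ according to

\[ -\gamma = \frac{1}{\|\eta\|} (a - \nu + 2\epsilon\sqrt{d}\,m) + \epsilon = \frac{1}{\|\eta\|} (b - \nu - \epsilon m) - \epsilon. \]

In order for this to work, $\epsilon$ must satisfy the equation

\[ 2\epsilon = \frac{1}{\|\eta\|} (b - a - \epsilon m - 2\epsilon\sqrt{d}\,m), \]

which has a solution

\[ \epsilon = \frac{b-a}{2\|\eta\| + m + 2\sqrt{d}\,m} \geq \frac{b-a}{(2\sqrt{d} + 3)m} \geq \Omega((b-a)/(2^k m^{3/2})). \]

(Note that $\epsilon$ is inverse-polynomial in the length of the input.) This concludes the proof.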
□ 
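The choice of $\gamma$ and $\epsilon$ at the end of this proof can be checked numerically. The following is a minimal sketch, not part of the original argument; the function name `wopt_parameters` and all numerical values are hypothetical and chosen only so that $\|\eta\| \leq m$ holds.

```python
import math


def wopt_parameters(a, b, nu, eta_norm, m, d):
    """Solve the two balance equations from the proof for (gamma, eps)."""
    sqrt_d = math.sqrt(d)
    # eps is the solution of 2*eps*||eta|| = b - a - eps*m - 2*eps*sqrt(d)*m.
    eps = (b - a) / (2 * eta_norm + m + 2 * sqrt_d * m)
    # -gamma is the right side of the "YES" inequality plus eps.
    minus_gamma = (a - nu + 2 * eps * sqrt_d * m) / eta_norm + eps
    return -minus_gamma, eps


# Hypothetical toy values (not from the text); they satisfy ||eta|| <= m.
a, b, nu, eta_norm, m, d = -0.5, 0.5, 0.1, 2.0, 3, 15
gamma, eps = wopt_parameters(a, b, nu, eta_norm, m, d)

# Both expressions for -gamma in the proof should coincide.
yes_side = (a - nu + 2 * eps * math.sqrt(d) * m) / eta_norm + eps
no_side = (b - nu - eps * m) / eta_norm - eps
# Lower bound on eps that follows from ||eta|| <= m.
lower_bound = (b - a) / ((2 * math.sqrt(d) + 3) * m)
```

With these values the two expressions for $-\gamma$ agree up to rounding, and `eps` is no smaller than `lower_bound`, as the chain of inequalities in the proof requires.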
Lemma 2.18 There is a poly-time mapping reduction from $\mathrm{WMEM}_{1/\mathrm{poly}}$ (on the set
$K'$) to Consistency. This reduction yields an instance of Consistency with $\beta \geq \delta/(2^k \sqrt{m})$.

Proof: We have an n-qubit system, and subsets $C_1, \ldots, C_m \subseteq \{1, \ldots, n\}$, where $|C_i| \leq k$.
Accordingly we define $\mathcal{S}$ to be the set of local Pauli matrices, and let $d = |\mathcal{S}|$. We
let $\alpha = (\alpha_P)_{P \in \mathcal{S}}$ denote a vector of expectation values of local Pauli matrices. Then
$K' = \{\alpha \in \mathbb{R}^d \mid \alpha \text{ is consistent with some n-qubit state } \sigma\}$.

We will eventually use this lemma as the final step in a reduction from Local Hamiltonian.
Note that $d \leq 4^k m - 1$ is polynomial in the length of the input to the Local
Hamiltonian problem.

The $\mathrm{WMEM}_{1/\mathrm{poly}}$ problem is as follows. We are given $\alpha \in \mathbb{R}^d$, $\delta \in \mathbb{R}$, and a unary
string "$1^s$," where $\delta \geq 1/s$. If $\alpha \in S(K', -\delta)$, we should answer "YES." If $\alpha \notin S(K', \delta)$,
we should answer "NO."

We reduce this to the following instance of the Consistency problem. We construct
the local density matrices $\rho_1, \ldots, \rho_m$ from the expectation values $\alpha_P$ ($P \in \mathcal{S}$), as described
in Lemma 2.13. We set $\beta = \delta/\sqrt{d}$. Note that $\beta \geq \delta/(2^k \sqrt{m})$ is inverse-polynomial
in the length of the input to WMEM, and it is also inverse-polynomial in
the length of the input to Local Hamiltonian.

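Clearly, a "YES" instance of $\mathrm{WMEM}_{1/\mathrm{poly}}$ maps to a "YES" instance of Consistency.
Now suppose we have a "NO" instance of $\mathrm{WMEM}_{1/\mathrm{poly}}$. Then for all n-qubit states $\sigma$,

\[ \Big( \sum_{P \in \mathcal{S}} (\mathrm{tr}(P\sigma) - \alpha_P)^2 \Big)^{1/2} > \delta. \]

Thus there is some $P \in \mathcal{S}$ such that $|\mathrm{tr}(P\sigma) - \alpha_P| > \delta/\sqrt{d}$. We know that P is
supported on some subset $C_i$, so we can write $P = \tilde{P} \otimes I$ where $\tilde{P}$ acts on $C_i$. Note that
$\alpha_P = \mathrm{tr}(\tilde{P}\rho_i)$. Also, let $\tilde{\sigma} = \mathrm{tr}_{\{1,\ldots,n\} \setminus C_i}(\sigma)$. Then we have

\[ |\mathrm{tr}(\tilde{P}\tilde{\sigma}) - \mathrm{tr}(\tilde{P}\rho_i)| > \delta/\sqrt{d}. \]

We will use $\tilde{P}$ to construct a measurement (POVM) that distinguishes between $\tilde{\sigma}$
and $\rho_i$. Since the eigenvalues of $\tilde{P}$ are all $\pm 1$, we can write $\tilde{P} = \Pi_1 - \Pi_2$, where $\Pi_1$
and $\Pi_2$ are projectors on orthogonal subspaces, and $\Pi_1 + \Pi_2 = I$. Thus $\{\Pi_1, \Pi_2\}$ is a
POVM. For the state $\tilde{\sigma}$, let $s_j$ be the probability of measuring j (for j = 1, 2); and for
the state $\rho_i$, let $r_j$ be the probability of measuring j (for j = 1, 2).

Then we have

\[ |\mathrm{tr}(\tilde{P}\tilde{\sigma}) - \mathrm{tr}(\tilde{P}\rho_i)| = |(s_1 - s_2) - (r_1 - r_2)| = 2|s_1 - r_1|. \]

Observe that the $\ell_1$ distance between s and r is $\|s - r\|_1 = |s_1 - r_1| + |s_2 - r_2| = 2|s_1 - r_1|$.
Also, this is a lower bound for the $L_1$ (matrix) distance between $\tilde{\sigma}$ and $\rho_i$. So we have

\[ \|\tilde{\sigma} - \rho_i\|_1 \geq \|s - r\|_1 > \delta/\sqrt{d} = \beta. \]

Thus we have a "NO" instance of Consistency. □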
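The measurement step in this proof can be illustrated numerically. The sketch below is an illustration only: the two random 2-qubit states, the particular Pauli $X \otimes Z$, and the helper names are assumptions made here, not objects from the text. It checks that the gap in the Pauli expectation values equals the $\ell_1$ distance between the two outcome distributions, and that this is at most the $L_1$ (matrix) distance between the states.

```python
import numpy as np


def random_density_matrix(dim, rng):
    """A random full-rank density matrix (hypothetical test data)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real


def trace_norm(m):
    """L1 (trace) norm of a Hermitian matrix: sum of |eigenvalues|."""
    return float(np.sum(np.abs(np.linalg.eigvalsh(m))))


X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
P = np.kron(X, Z)              # a Pauli matrix with eigenvalues +1 and -1
I4 = np.eye(4)
Pi1 = (I4 + P) / 2             # projector onto the +1 eigenspace
Pi2 = (I4 - P) / 2             # projector onto the -1 eigenspace

rng = np.random.default_rng(0)
sigma = random_density_matrix(4, rng)
rho = random_density_matrix(4, rng)

# Outcome distributions of the POVM {Pi1, Pi2} on the two states.
s = np.array([np.trace(Pi1 @ sigma).real, np.trace(Pi2 @ sigma).real])
r = np.array([np.trace(Pi1 @ rho).real, np.trace(Pi2 @ rho).real])

gap = abs(np.trace(P @ sigma).real - np.trace(P @ rho).real)
l1_classical = float(np.sum(np.abs(s - r)))
l1_matrix = trace_norm(sigma - rho)
```

Here `gap` equals `2 * abs(s[0] - r[0])` and `l1_classical`, and neither exceeds `l1_matrix`, which is the inequality used to conclude that the instance is a "NO" instance.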

We are now ready to prove that Consistency is QMA-hard.

Proof of Theorem 2.12: Use the previous two lemmas, and the reduction from $\mathrm{WOPT}_\epsilon$
to $\mathrm{WMEM}_\delta$ in Theorem 2.3. Note that by Proposition 2.8 and the properties of the set
$K'$, the reduction from $\mathrm{WOPT}_\epsilon$ to $\mathrm{WMEM}_\delta$ has the following precision requirement:



2.5 Discussion 

Consistency of local density matrices is an interesting problem that gives some new 
insight into the class QMA. The reduction from Local Hamiltonian is nontrivial, and in 
that sense, Consistency seems to be an easier problem to deal with. One direction for 
future work is to try to find additional QMA-complete problems by giving reductions 
from Consistency (rather than from Local Hamiltonian). 

Another question is whether Consistency remains QMA-hard under mapping reduc- 
tions. We mention that we can build zero-knowledge proof systems for Consistency 
[59], using techniques developed by Watrous [85]. If we could show that Consistency is
QMA-hard under mapping reductions, then we could get zero-knowledge proof systems
for any language in QMA.

Chapter 3

N-representability is QMA-complete

(This chapter is joint work with Matthias Christandl and Frank Verstraete.)

3.1 Introduction 

The central theoretical problem in the field of many-body strongly correlated quantum
systems is to find efficient ways of simulating Schrödinger's equations. The main difficulty
is the fact that the dimension of the Hilbert space describing a system of N
quantum particles scales exponentially in N. This makes a direct numerical simulation 
intractable: every time an extra particle is added to the system, the computational 
resources would have to be doubled. 

The situation is not hopeless, however, as in principle it could be that all physical 
wavefunctions, i.e., the ones that are realized in nature, have very special properties 
and can be parameterized in an efficient way. The idea would then be to propose a 
variational class of wavefunctions that capture the physics of the systems of interest, 
and then do an optimization over this restricted class. This approach has proven to be 
very successful, as witnessed by mean field theory and renormalization group methods. 
However, it is still an open problem to find an efficient variational class to describe 
complex wavefunctions such as those arising in quantum chemistry. 

One of the basic problems in quantum chemistry is to find the ground state of a 
Hamiltonian describing the many-body system of an atom or molecule. Here one is 
mainly interested in the behavior of the electrons; the nuclei are assumed to be fixed, 
possibly in some non-equilibrium geometry. These Hamiltonians are very ungeneric, 
because they contain at most 2-body interactions. This implies that the number of free 
parameters in such Hamiltonians scales at most quadratically in the number of particles 
or modes, and hence the ground states of all such systems form a small-dimensional 
manifold. 

For a Hamiltonian with only 2-body interactions, the energy corresponding to a
wavefunction is completely determined by its 2-body correlation functions, and as a
consequence the ground state will be the one with extremal 2-body reduced density
operators. This fact was realized a long time ago, and led Coulson [30, 78] to propose
the following problem: given a set of N quantum particles, can we characterize the
allowed sets of 2-body correlations or density operators between all pairs of particles?

If the particles under consideration are fermions, as is the case in quantum chemistry,
this has been called the N-representability problem [28]. Here, we consider the reduced
density operators acting on pairs of fermions, and we want to decide whether they are
consistent with some global state over N fermions. An efficient solution to the
N-representability problem would be a huge breakthrough, as it would (for example) allow
us to calculate the binding energies of all molecules. Therefore, a very large effort has
been devoted to solving this problem [291 (13 [M] • 

Here we will give strong evidence that the N-representability problem is intractable,
as it is QMA-complete and hence NP-hard. By "intractable," we mean that, for large 
N, solving the problem in the worst case requires a number of operations that grows 
exponentially in N. The complexity class QMA (Quantum Merlin-Arthur) is the natu- 
ral generalization of the class NP (nondeterministic polynomial time) to the setting of 
quantum computing. Colloquially, a problem is in QMA if there exists an efficient quan- 
tum algorithm that, when given a possible solution to the problem, can verify whether 
it is correct; here the "solution" may be a quantum state on polynomially many qubits. 
A problem is QMA-hard if it is at least as hard as any other problem in QMA; that 
is, given an efficient algorithm for this problem, one could solve every other problem in 
QMA efficiently. We say that a problem is QMA-complete if it is in QMA and it is also 
QMA-hard. 

In a seminal work, Kitaev [53] proved that the Local Hamiltonian problem (determining
the ground state energy of a spin Hamiltonian on n qubits that is a sum of 5-body terms,
with accuracy $\pm\epsilon$ where $\epsilon$ is inverse polynomial in n) is QMA-complete.
In fact, it was later shown that this problem remains QMA-complete when restricted
to 2-body interactions [51], and even in the case of geometrically local interactions [69].
In this paper, we extend these results to fermionic systems, and show that Fermionic 
2-Local Hamiltonian is QMA-complete. 

Another problem is to decide whether a given set of local density operators is con- 
sistent, i.e., whether they can be realized as the reduced density operators of the same 
global state. In a certain sense, this is the dual of the Local Hamiltonian problem (see 
chapter 4 of this dissertation). The consistency problem has been studied for spin sys- 
tems, and it was recently shown to be QMA-complete (see chapter 2 of this dissertation) 
[60]. In the present paper, we will prove that N-representability, which is the fermionic
version of the consistency problem, is also QMA-complete. 

3.2 Fermions 

We review some basic facts about fermions; see [77] for more on this, and other topics in
quantum chemistry. Consider a system of N particles, where each particle has d energy
levels, and the particles obey Fermi statistics. (For instance, we might have N electrons,
and we fix a basis set consisting of d single-electron orbitals.) We assume $d \geq N$. Since
the particles are fermions, we only allow N-particle states that are antisymmetric under
exchanges of pairs of particles. This implies that no two particles can occupy the same
state (the Pauli exclusion principle); hence the assumption that $d \geq N$. Also, we assume
that the interactions in the system do not create or destroy particles, so we are interested
in states with exactly N particles.

We will now construct a basis for the space of N-particle fermionic states. Let
$|\varphi_1\rangle, \ldots, |\varphi_d\rangle$ be an orthonormal basis for a single particle. Fix an ordering of the
particles, from 1 to N. For any indices $i_1, \ldots, i_N \in \{1, \ldots, d\}$, we can construct an
N-particle fermionic state using a "Slater determinant":

\[ |\varphi_{i_1, \ldots, i_N}\rangle = \frac{1}{\sqrt{N!}} \det \Big[ |\varphi_{i_b}\rangle^{(a)} \Big]_{a,b=1}^{N}. \]

Here we construct a matrix whose (a,b)'th entry is $|\varphi_{i_b}\rangle^{(a)}$, which means that the a'th
particle is in state $|\varphi_{i_b}\rangle$; then we take its "determinant" and get a superposition of
tensor product states. Note that the determinant is nonzero if and only if the $i_1, \ldots, i_N$
are distinct, i.e., no two particles can be in the same state. Also, changing the order of
the $i_1, \ldots, i_N$ only affects the sign of the determinant. We adopt the convention that
the $i_1, \ldots, i_N$ always appear in increasing order. For any $I \subseteq \{1, \ldots, d\}$, $|I| = N$, we
define

\[ |\varphi_I\rangle = |\varphi_{i_1, \ldots, i_N}\rangle, \]

where $I = \{i_1, \ldots, i_N\}$, and $i_1 < \cdots < i_N$. There are $\binom{d}{N}$ states of this form, and they
form an orthonormal basis for the space of all N-particle fermionic states.
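The Slater-determinant construction can be made concrete for small N and d. The following is a minimal sketch under stated assumptions: modes are labeled from 0 rather than 1, the single-particle basis is the standard basis of $\mathbb{R}^d$, and the helper names `permutation_sign` and `slater_state` are hypothetical.

```python
import itertools
import math

import numpy as np


def permutation_sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    sign = 1
    for x in range(len(perm)):
        for y in range(x + 1, len(perm)):
            if perm[x] > perm[y]:
                sign = -sign
    return sign


def slater_state(indices, d):
    """Expand the Slater determinant for the given mode indices.

    Returns a vector in (R^d)^{tensor N}, N = len(indices), where
    particle a occupies mode indices[perm[a]] in the term for perm.
    """
    n = len(indices)
    state = np.zeros(d ** n)
    for perm in itertools.permutations(range(n)):
        pos = 0
        for a in range(n):
            pos = pos * d + indices[perm[a]]   # index of the product state
        state[pos] += permutation_sign(perm)
    return state / math.sqrt(math.factorial(n))


d, n = 4, 2
basis = [slater_state(I, d) for I in itertools.combinations(range(d), n)]
gram = np.array([[u @ v for v in basis] for u in basis])
```

The list `basis` has $\binom{d}{N}$ elements and its Gram matrix is the identity, which is the orthonormality claim. Repeating an index gives the zero vector, and swapping two indices flips the sign.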

Let $\sigma$ be a density matrix describing an N-particle state; then the 2-particle reduced
density matrix (2-RDM) is given by

\[ \rho^{(2)} = \mathrm{tr}_{3,\ldots,N}(\sigma). \]

This is a matrix of dimension $\binom{d}{2} \times \binom{d}{2}$. Since the N-particle state is antisymmetric,
the 2-RDM is the same for every pair of particles. Also, note that it is not necessary
to know anything about the single-particle states $|\varphi_1\rangle, \ldots, |\varphi_d\rangle$, besides the fact that
they are orthonormal. The partial trace and the question of N-representability do not
depend on the choice of basis.

We are also interested in fermionic local Hamiltonians, that is, Hamiltonians on N
fermions, that consist of the same 2-particle interaction acting on every pair of particles:

\[ H = \sum_{i \neq j} A^{(i,j)}. \]

Here, A is a matrix of dimension $\binom{d}{2} \times \binom{d}{2}$.



3.2.1 Second-Quantized Operators

"Second quantization" provides a nice way to describe fermionic systems. The basic
idea is that, rather than dealing with the individual particles, one should pay attention
to which of the states $|\varphi_1\rangle, \ldots, |\varphi_d\rangle$ are occupied. This gives a unified way of describing
states with different numbers of particles. It is particularly helpful in dealing with
the 2-RDM, because it avoids the messy step of tracing out the other N − 2 particles.
This clarifies the relationship between the N-particle state and the 2-RDM. (Note: the
formalism of second quantization is used in our proofs, but is not needed in the statement
of our results.)

Let $V_N$ denote the space of N-particle fermionic states. We will consider the space
of all fermionic states, where the number of particles varies from 0 to d; this is given by

\[ V = \bigoplus_{N=0}^{d} V_N. \]

(This is known as Fock space.) Note that states with different numbers of particles lie
in orthogonal subspaces. Generally, we will only be interested in states with a fixed
number of particles N, so the state is described by a density matrix whose support lies
in the subspace $V_N$. However, we will find it useful to define operators (e.g., observables
and Hamiltonians) that act on the whole space V. In particular, we will do this with
local observables and local Hamiltonians; here, the operator acts identically on all pairs
of particles, so its meaning is independent of the total number of particles N.

Annihilation and creation operators are the basic tools for working in Fock space.
For every $i \in \{1, \ldots, d\}$, we define the annihilation and creation operators, $a_i$ and $a_i^\dagger$,
by describing how they act on the Slater basis states $|\varphi_I\rangle$:

\[ a_i |\varphi_I\rangle = 0 \text{ if } i \notin I, \]
\[ a_i |\varphi_I\rangle = (-1)^{f(I,i)} |\varphi_{I \setminus \{i\}}\rangle \text{ if } i \in I, \]
\[ a_i^\dagger |\varphi_I\rangle = 0 \text{ if } i \in I, \]
\[ a_i^\dagger |\varphi_I\rangle = (-1)^{f(I,i)} |\varphi_{I \cup \{i\}}\rangle \text{ if } i \notin I, \]

where $f(I,i) = |\{j \in I \mid j < i\}|$. Intuitively, $a_i$ annihilates a particle in state $|\varphi_i\rangle$, or
returns 0 if no such particle exists, while $a_i^\dagger$ creates a particle in state $|\varphi_i\rangle$, or returns 0
if such a particle already exists. (Thus, given an N-particle state, $a_i$ returns an (N − 1)-
particle state, while $a_i^\dagger$ returns an (N + 1)-particle state.) The particle is annihilated or
created in the first (far left) column of the Slater determinant; moving it to its proper
position, among the elements of I in ascending order, produces the $(-1)^{f(I,i)}$ phase
factor. (Note that $a_i^\dagger$ is indeed the adjoint of $a_i$.)

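Note that an N-particle Slater basis state $|\varphi_I\rangle$, where $I = \{i_1, \ldots, i_N\}$, $i_1 < \cdots < i_N$,
can be written in the form

\[ |\varphi_I\rangle = a_{i_1}^\dagger a_{i_2}^\dagger \cdots a_{i_N}^\dagger |0\rangle, \]

where $|0\rangle$ is the state with zero fermions, i.e., the vacuum state. Also, any N-particle
state $|\psi\rangle$ can be written in the form

\[ |\psi\rangle = \sum_{\substack{j_1, \ldots, j_d \in \{0,1\} \\ j_1 + \cdots + j_d = N}} c_{j_1 \cdots j_d}\, (a_1^\dagger)^{j_1} \cdots (a_d^\dagger)^{j_d} |0\rangle. \]

Also note that $a_i$ and $a_i^\dagger$ satisfy the following anticommutation rules:

\[ a_i a_j = -a_j a_i, \qquad a_i^\dagger a_j^\dagger = -a_j^\dagger a_i^\dagger, \]
\[ a_i a_j^\dagger = \delta_{ij} - a_j^\dagger a_i. \]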
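The definitions of $a_i$ and $a_i^\dagger$ and the anticommutation rules can be verified directly for a small number of modes. The sketch below makes some assumptions for illustration: modes are labeled from 0, a Slater basis state $|\varphi_I\rangle$ is stored as a bitmask (bit j set if and only if $j \in I$), and the function name `creation_operator` is hypothetical.

```python
import numpy as np


def creation_operator(i, d):
    """Matrix of a_i^dagger on the 2^d-dimensional Fock space."""
    dim = 2 ** d
    op = np.zeros((dim, dim))
    for mask in range(dim):
        if mask & (1 << i):
            continue                      # mode i occupied: result is 0
        # f(I, i) = number of occupied modes j in I with j < i
        f = bin(mask & ((1 << i) - 1)).count("1")
        op[mask | (1 << i), mask] = (-1) ** f
    return op


def anticommutator(x, y):
    return x @ y + y @ x


d = 3
create = [creation_operator(i, d) for i in range(d)]
annihilate = [c.T for c in create]        # real matrices: adjoint = transpose

# Build the Slater basis state for I = {0, 2} from the vacuum.
vacuum = np.zeros(2 ** d)
vacuum[0] = 1.0
phi_02 = create[0] @ create[2] @ vacuum
```

All three anticommutation rules hold for these matrices, and `phi_02` is exactly the basis vector with bitmask `0b101` with coefficient +1, matching the formula for $|\varphi_I\rangle$ in terms of creation operators.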

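Second quantization gives a convenient expression for the 2-RDM. First, if $|\psi\rangle$ is an
N-fermion state, then a straightforward calculation shows that

\[ (\langle\varphi_i| \otimes I^{\otimes(N-1)}) |\psi\rangle = \frac{1}{\sqrt{N}}\, a_i |\psi\rangle. \]

That is, taking the inner product with $|\varphi_i\rangle$ on the first particle is equivalent to applying
the annihilation operator $a_i$. Similarly, when we act on the first and second particles,
we get that

\[ (\langle\varphi_i| \otimes \langle\varphi_j| \otimes I^{\otimes(N-2)}) |\psi\rangle = \frac{1}{\sqrt{N(N-1)}}\, a_j a_i |\psi\rangle. \qquad (3.1) \]

Now suppose $\rho^{(2)} = \mathrm{tr}_{3,\ldots,N}(|\psi\rangle\langle\psi|)$ is the 2-RDM corresponding to $|\psi\rangle$. Then the
matrix elements of $\rho^{(2)}$ are given by

\[ \rho^{(2)}_{ijkl} = (\langle\varphi_i| \otimes \langle\varphi_j|)\, \rho^{(2)}\, (|\varphi_k\rangle \otimes |\varphi_l\rangle) = \frac{1}{N(N-1)}\, \mathrm{tr}\big((a_k^\dagger a_l^\dagger a_j a_i)\, |\psi\rangle\langle\psi|\big). \]

That is, the matrix elements of $\rho^{(2)}$ are equal to the expectation values of products of
annihilation and creation operators. This extends to the general case, where the N-
particle state is described by a density matrix $\sigma$, the corresponding 2-RDM is $\rho^{(2)} =
\mathrm{tr}_{3,\ldots,N}(\sigma)$, and we have that

\[ \rho^{(2)}_{ijkl} = \frac{1}{N(N-1)}\, \mathrm{tr}\big((a_k^\dagger a_l^\dagger a_j a_i)\, \sigma\big). \qquad (3.2) \]

Note that $\rho^{(2)}_{ijkl} = 0$ if i = j or k = l; this is consistent with the fact that no two fermions
can occupy the same state.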
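Equation (3.2) gives a direct recipe for computing a 2-RDM from a state written in Fock space. The following is a minimal sketch, assuming 0-based mode labels, a bitmask encoding of the Slater basis, and a random pure state with real amplitudes; the names and the choice d = 4, N = 3 are hypothetical. It checks three consequences of (3.2): the 2-RDM has trace 1, it vanishes when i = j, and it is antisymmetric under exchange of i and j.

```python
import itertools

import numpy as np


def annihilation_operator(i, d):
    """Matrix of a_i on the 2^d-dimensional Fock space (bitmask basis)."""
    dim = 2 ** d
    op = np.zeros((dim, dim))
    for mask in range(dim):
        if mask & (1 << i):
            f = bin(mask & ((1 << i) - 1)).count("1")
            op[mask ^ (1 << i), mask] = (-1) ** f
    return op


d, n = 4, 3
a = [annihilation_operator(i, d) for i in range(d)]

# Random pure state supported on the N-particle subspace only.
rng = np.random.default_rng(1)
psi = np.zeros(2 ** d)
for mask in range(2 ** d):
    if bin(mask).count("1") == n:
        psi[mask] = rng.normal()
psi /= np.linalg.norm(psi)

# rdm[i, j, k, l] = <psi| a_k^dag a_l^dag a_j a_i |psi> / (N (N - 1))
rdm = np.zeros((d, d, d, d))
for i, j, k, l in itertools.product(range(d), repeat=4):
    op = a[k].T @ a[l].T @ a[j] @ a[i]
    rdm[i, j, k, l] = psi @ op @ psi / (n * (n - 1))

trace = sum(rdm[i, j, i, j] for i in range(d) for j in range(d))
```

The trace equals 1 because $\sum_{i,j} a_i^\dagger a_j^\dagger a_j a_i$ acts as N(N − 1) on every N-particle state.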

Second quantization also gives a convenient expression for a fermionic local Hamiltonian
$H = \sum_{i \neq j} A^{(i,j)}$. First, we write down the matrix elements of A:

\[ A_{ijkl} = (\langle\varphi_i| \otimes \langle\varphi_j|)\, A\, (|\varphi_k\rangle \otimes |\varphi_l\rangle). \]

Observe that, for any N-fermion state $|\psi\rangle$,

\[ \langle\psi|H|\psi\rangle = N(N-1)\, \langle\psi|A^{(1,2)}|\psi\rangle \]
\[ = N(N-1) \sum_{ijkl} A_{ijkl}\, \langle\psi| \big( (|\varphi_i\rangle \otimes |\varphi_j\rangle)(\langle\varphi_k| \otimes \langle\varphi_l|) \otimes I^{\otimes(N-2)} \big) |\psi\rangle \]
\[ = \langle\psi| \sum_{ijkl} A_{ijkl}\, (a_i^\dagger a_j^\dagger a_l a_k)\, |\psi\rangle, \]

where we used the antisymmetry of the state $|\psi\rangle$, and equation (3.1). Thus we can write
H in the following form:

\[ H = \sum_{ijkl} A_{ijkl}\, (a_i^\dagger a_j^\dagger a_l a_k). \]

Note that those matrix elements $A_{ijkl}$ with i = j or k = l do not contribute to the sum;
this is because we only consider the action of A on fermionic states.

3.2.2 Two-Particle Observables 

We construct a complete set of 2-particle observables. First, define $a_I = a_{i_2} a_{i_1}$, for all
pairs of modes $I = \{i_1, i_2\}$, $i_1 < i_2$. Also fix an ordering on the pairs I. Let L denote
the last pair in the ordering (so $I \prec L$, for all $I \neq L$). We now define the following
observables:

\[ X_{IJ} = a_I^\dagger a_J + a_J^\dagger a_I, \text{ for all } I \prec J, \qquad (3.3) \]
\[ Y_{IJ} = -i a_I^\dagger a_J + i a_J^\dagger a_I, \text{ for all } I \prec J, \qquad (3.4) \]
\[ Z_I = a_I^\dagger a_I, \text{ for all } I. \qquad (3.5) \]

These operators are Hermitian, with eigenvalues in the interval [−1, 1]. Let $\mathcal{S}$ be the set
of all these observables, except for $Z_L$. Note that $|\mathcal{S}| \leq d^4$.

Taking real linear combinations, the operators $S \in \mathcal{S}$ form a basis for the space of all
2-local fermionic Hamiltonians, i.e., any 2-local fermionic Hamiltonian can be written in
the form

\[ H = \gamma_0 I + \sum_{S \in \mathcal{S}} \gamma_S S, \qquad \gamma_0, \gamma_S \in \mathbb{R}. \]

Note that these observables can act on states with arbitrary numbers of particles.
In particular, they can act on an N-particle state $\sigma$, or on the corresponding 2-RDM
$\rho = \mathrm{tr}_{3,\ldots,N}(\sigma)$. The expectation values are the same up to a normalization factor:

\[ \mathrm{tr}(S\rho) = \frac{1}{\binom{N}{2}}\, \mathrm{tr}(S\sigma), \qquad S \in \mathcal{S}. \]



The observables $S \in \mathcal{S}$ are especially useful for working with 2-particle states. In
particular, the expectation values of S contain complete information about the state.
To see this, let us restrict S to act only on the space of 2-particle states. Then each
annihilation operator $a_I$ "picks out" a single Slater basis state $|\varphi_I\rangle$, and so the operators
S can be written in the following simple way:

\[ Z_I = |\varphi_I\rangle\langle\varphi_I|, \]
\[ X_{IJ} = |\varphi_I\rangle\langle\varphi_J| + |\varphi_J\rangle\langle\varphi_I|, \]
\[ Y_{IJ} = -i|\varphi_I\rangle\langle\varphi_J| + i|\varphi_J\rangle\langle\varphi_I|. \]

Note that $Z_I$ is a projector onto the state $|\varphi_I\rangle$, while $X_{IJ}$ is a rank-2 operator with
eigenvalues $\pm 1$ and eigenvectors $\frac{1}{\sqrt{2}}(|\varphi_I\rangle \pm |\varphi_J\rangle)$, and $Y_{IJ}$ is a rank-2 operator with
eigenvalues $\pm 1$ and eigenvectors $\frac{1}{\sqrt{2}}(|\varphi_I\rangle \pm i|\varphi_J\rangle)$. These operators have the following
orthogonality properties:

    A          B            tr(AB)
    Z_I        Z_{I'}       1 if I = I', 0 otherwise
    Z_I        X_{I'J'}     0
    Z_I        Y_{I'J'}     0
    X_{IJ}     X_{I'J'}     2 if I = I' and J = J', 0 otherwise
    X_{IJ}     Y_{I'J'}     0
    Y_{IJ}     Y_{I'J'}     2 if I = I' and J = J', 0 otherwise

(Some of these identities also hold when we consider N-particle states. However, $Z_I$ and
$Z_{I'}$ are not orthogonal when we view them as operators acting on N-particle states.)
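The orthogonality properties, and the fact that the expectation values of the observables determine a 2-particle state, can be checked numerically on the $\binom{d}{2}$-dimensional space of 2-particle states. The sketch below is an illustration under assumptions: Slater basis states are represented as standard basis vectors indexed by position in the list of pairs, the ordering of pairs is the lexicographic one from `itertools.combinations`, and the test state is random.

```python
import itertools

import numpy as np

d = 4
pairs = list(itertools.combinations(range(d), 2))   # ordered pairs I; last is L
D = len(pairs)
L = D - 1


def ket_bra(p, q):
    """The operator |phi_p><phi_q| on the 2-particle space."""
    m = np.zeros((D, D), dtype=complex)
    m[p, q] = 1
    return m


Z = {p: ket_bra(p, p) for p in range(D)}
X = {(p, q): ket_bra(p, q) + ket_bra(q, p)
     for p in range(D) for q in range(p + 1, D)}
Y = {(p, q): -1j * ket_bra(p, q) + 1j * ket_bra(q, p)
     for p in range(D) for q in range(p + 1, D)}

# Random 2-particle density matrix (hypothetical test data).
rng = np.random.default_rng(2)
g = rng.normal(size=(D, D)) + 1j * rng.normal(size=(D, D))
rho = g @ g.conj().T
rho /= np.trace(rho).real

# Rebuild rho from the expectation values of the observables in S only.
rebuilt = Z[L].copy()
for p in range(L):
    rebuilt += np.trace(Z[p] @ rho).real * (Z[p] - Z[L])
for key in X:
    rebuilt += 0.5 * np.trace(X[key] @ rho).real * X[key]
    rebuilt += 0.5 * np.trace(Y[key] @ rho).real * Y[key]

num_observables = (len(Z) - 1) + len(X) + len(Y)     # |S|, with Z_L excluded
```

Here `rebuilt` coincides with `rho`, and `num_observables` equals $\binom{d}{2}^2 - 1$, which is below the bound $d^4$ quoted in the text.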

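From these orthogonality properties, it follows that any 2-particle state $\rho$ can be
written in the form

\[ \rho = Z_L + \sum_{I \prec L} \alpha(Z_I)(Z_I - Z_L) + \frac{1}{2} \sum_{I \prec J} \alpha(X_{IJ}) X_{IJ} + \frac{1}{2} \sum_{I \prec J} \alpha(Y_{IJ}) Y_{IJ}, \]

where

\[ \alpha(Z_I) = \mathrm{tr}(Z_I \rho), \text{ for all } I \prec L, \]
\[ \alpha(X_{IJ}) = \mathrm{tr}(X_{IJ} \rho), \text{ for all } I \prec J, \]
\[ \alpha(Y_{IJ}) = \mathrm{tr}(Y_{IJ} \rho), \text{ for all } I \prec J. \]

(The coefficient in front of $Z_L$ is fixed due to the fact that $\rho$ has trace 1.) Note that the
$\alpha$'s are simply the expectation values of the observables S, that is, $\alpha_S = \mathrm{tr}(S\rho)$, for all
$S \in \mathcal{S}$.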

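One application of this is to distinguish between two different 2-particle states, $\rho$
and $\rho'$. We claim that the $\ell_1$ distance $\|\rho - \rho'\|_1$, and the difference in expectation values
$|\mathrm{tr}(S\rho) - \mathrm{tr}(S\rho')|$, are related up to a polynomial factor. More precisely, we show the
following:

Lemma 3.1 There exists some $S \in \mathcal{S}$ such that $|\mathrm{tr}(S\rho) - \mathrm{tr}(S\rho')| \geq \|\rho - \rho'\|_1/(2d^4)$.
Also, for all $S \in \mathcal{S}$, $|\mathrm{tr}(S\rho) - \mathrm{tr}(S\rho')| \leq \|\rho - \rho'\|_1$.

Proof: For the first claim, we let $\alpha_S = \mathrm{tr}(S\rho)$ and $\alpha'_S = \mathrm{tr}(S\rho')$, and we write

\[ \rho - \rho' = \sum_{I \prec L} (\alpha(Z_I) - \alpha'(Z_I))(Z_I - Z_L) + \frac{1}{2} \sum_{I \prec J} (\alpha(X_{IJ}) - \alpha'(X_{IJ})) X_{IJ} + \frac{1}{2} \sum_{I \prec J} (\alpha(Y_{IJ}) - \alpha'(Y_{IJ})) Y_{IJ}. \]

By the triangle inequality, and using the fact that $\|Z_I - Z_L\|_1, \|X_{IJ}\|_1, \|Y_{IJ}\|_1 \leq 2$ when
we view these as operators on 2-particle states, we get

\[ \|\rho - \rho'\|_1 \leq 2 \sum_{I \prec L} |\alpha(Z_I) - \alpha'(Z_I)| + \sum_{I \prec J} |\alpha(X_{IJ}) - \alpha'(X_{IJ})| + \sum_{I \prec J} |\alpha(Y_{IJ}) - \alpha'(Y_{IJ})|, \]

so there must be some $S \in \mathcal{S}$ such that

\[ |\alpha_S - \alpha'_S| \geq \frac{\|\rho - \rho'\|_1}{2|\mathcal{S}|} \geq \frac{\|\rho - \rho'\|_1}{2d^4}. \]

Now we show the second claim. For any $S \in \mathcal{S}$, let p be the distribution of the
outcomes when one measures S on the state $\rho$, and let p' be the distribution of the
outcomes when one measures S on the state $\rho'$. Then, using the fact that the measurement
outcomes are in the range [−1, 1], we have that

\[ |\mathrm{tr}(S\rho) - \mathrm{tr}(S\rho')| \leq \|p - p'\|_1 \leq \|\rho - \rho'\|_1. \]

□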

3.3 The N-representability and Fermionic Local Hamiltonian problems

We have a system of N electrons, and a basis set consisting of d single-electron orbitals.
(The nuclei are assumed to be fixed, possibly in some non-equilibrium geometry.) For
our purposes, N is the parameter that describes the size of the system, d is typically
much larger than N, and the space of N-electron states has dimension $\binom{d}{N}$; if $d \geq cN$
for some constant c > 1, then this grows exponentially in N. However, in practice d
cannot be chosen too large, because the 2-RDM, and the 2-electron interaction in the
Hamiltonian, are described by matrices of dimension $\binom{d}{2}$. We will be mainly interested
in cases where $N \leq d \leq \mathrm{poly}(N)$. We would like to solve N-representability, or find
ground state energies, with additive error $\pm 1/\mathrm{poly}(N)$.

Formally, we define the N-representability problem as follows:

Consider a system of N fermions, where each particle has d energy levels.
We are given a 2-particle density matrix $\rho$, of size $\binom{d}{2} \times \binom{d}{2}$. In addition, we
are given a string "$1^s$" (the unary encoding of a natural number s), and a
real number $\beta \geq 1/s$.

All numbers are specified with poly(N, s) bits of precision.
The problem is to distinguish between the following two cases:

• There exists an N-fermion state $\sigma$ such that $\mathrm{tr}_{3,\ldots,N}(\sigma) = \rho$. In this
case, answer "YES."

• For all N-fermion states $\sigma$, $\|\mathrm{tr}_{3,\ldots,N}(\sigma) - \rho\|_1 \geq \beta$. In this case, answer
"NO."

If neither of these cases applies, then one may answer either "YES" or "NO."

(Note that we use the $\ell_1$ matrix norm, $\|A\|_1 = \mathrm{tr}|A|$, to measure the distance
between $\sigma$ and $\rho$.)

An instance of this problem is described by a string of length $\ell = O(d^4\, \mathrm{poly}(N, s) + s)$,
and we say an algorithm solves the problem efficiently if it takes time polynomial
in $\ell$. We claim that this formal definition is equivalent to our intuitive notion of what it
means to solve the problem. Intuitively, an algorithm solves the problem efficiently if,
on instances where $N \leq d \leq \mathrm{poly}(N)$ and $\beta \geq 1/\mathrm{poly}(N)$, the algorithm runs in time
poly(N).

Clearly, the formal definition implies the intuitive one, since on instances where
$N \leq d \leq \mathrm{poly}(N)$ and $\beta \geq 1/\mathrm{poly}(N)$, the length of the input is $\leq \mathrm{poly}(N)$.

To show that the intuitive definition implies the formal one, we use a padding argument.
Suppose the intuitive definition holds. Then, given an arbitrary instance of the
problem, one can solve it in time polynomial in the length of the input, as follows. One
modifies the problem to have q extra modes (energy levels) and q extra particles, and
one modifies the 2-fermion state $\rho$ to enforce the constraint that these q extra modes are
always occupied. Also, one decreases the error parameter $\beta$ by a factor polynomial in d + q. This
produces a new instance of the problem, which is equivalent to the old instance. In this
way we can increase N and d so that the promises $N \leq d \leq \mathrm{poly}(N)$ and $\beta \geq 1/\mathrm{poly}(N)$
are satisfied, but N is still at most polynomially large compared to the length of the
input. Then the problem can be solved in time poly(N), which is polynomial in the
length of the input.

We also define the Fermionic Local Hamiltonian problem, as follows: 

Consider a system of N fermions, where each particle has d energy levels. We
are given a 2-particle Hamiltonian A, which is a $\binom{d}{2} \times \binom{d}{2}$ Hermitian matrix
with $\|A\| \leq 1$. In addition, we are given a string "$1^s$" (the unary encoding of
a natural number s), and two real numbers a and b, such that $b - a \geq 1/s$.

All numbers are specified with poly(N, s) bits of precision.

Define the N-particle Hamiltonian to be $H = \sum_{i \neq j} A^{(i,j)}$, restricted to the
subspace of N-fermion states. The problem is to distinguish between the
following two cases:

• If H has an eigenvalue that is $\leq a$, answer "YES."

• If all the eigenvalues of H are $\geq b$, answer "NO."

If neither of these cases applies, then one may answer either "YES" or "NO."

Again, an instance of this problem is described by a string of length $\ell = O(d^4\,
\mathrm{poly}(N, s) + s)$, and we say an algorithm solves the problem efficiently if it takes time
polynomial in $\ell$. This formal definition is equivalent to our intuitive notion of what
it means to solve the problem (using a padding argument, as above). Intuitively, an
algorithm solves the problem efficiently if, on instances where $N \leq d \leq \mathrm{poly}(N)$ and
$b - a \geq 1/\mathrm{poly}(N)$, the algorithm runs in time poly(N).

3.4 Our Results 

First, we show that any 2-local Hamiltonian of spins can be simulated using a 2-local
Hamiltonian of fermions with d = 2N, and hence Fermionic Local Hamiltonian is QMA-hard.
Then, using techniques of convex programming, we show that an efficient algorithm
for N-representability would allow us to estimate the ground state energies of
2-local Hamiltonians of fermions; thus, N-representability is QMA-hard.

One might expect that Fermionic Local Hamiltonian would be QMA-hard, but it is
somewhat surprising to find that N-representability, which was believed to be tractable,
is also QMA-hard. In fact, N-representability is QMA-hard for precisely the same
reasons that first attracted the interest of the quantum chemists: convex optimization.
Previous work tried to formulate explicit "N-representability conditions" that could be
used in variational calculations. In this paper we use a more general framework, convex
optimization with a membership oracle (see chapter 2) [88, 38], to show that any efficient
solution to N-representability is impossible unless QMA is tractable.

Second, we show that the above two problems are in QMA. The natural "witness"
for these problems is a fermionic state; using the Jordan-Wigner transform, this state
can be represented using qubits, in such a way that its local properties can be efficiently
verified by a quantum computer. This is similar to the techniques used to simulate
fermionic systems on a quantum computer [70, 23, 2].

3.5 Fermionic Local Hamiltonian is QMA-hard 

Theorem 3.2 There is a poly-time mapping reduction from 2-Local Hamiltonian to 
Fermionic 2-Local Hamiltonian. 

Proof: We show how to map a 2-local Hamiltonian, $H_{\mathrm{qubit}}$, defined on a system of N
qubits, to a 2-local Hamiltonian on fermions, $H_{\mathrm{fermi}}$, with d = 2N modes, such that the
ground state energy remains the same. (This is the opposite of what has been done in
[82].)

We represent each qubit i as a single fermion that can be in two different modes
$a_i, b_i$; so each N-qubit basis state corresponds to the following N-fermion state:

\[ |z_1\rangle \cdots |z_N\rangle \mapsto (a_1^\dagger)^{1-z_1} (b_1^\dagger)^{z_1} \cdots (a_N^\dagger)^{1-z_N} (b_N^\dagger)^{z_N} |\Omega\rangle. \qquad (3.6) \]

The fermionic Hamiltonian, $H_{\mathrm{fermi}}$, consists of two parts: $H_A$, which "simulates" $H_{\mathrm{qubit}}$
on the fermionic states shown above; and $H_B$, which enforces the constraint that there
is exactly one fermion at each site i.

First we construct $H_A$. A Pauli matrix acting on qubit i corresponds to a bilinear
function of the creation and annihilation operators:

\[ \sigma_i^x \mapsto a_i^\dagger b_i + b_i^\dagger a_i, \qquad \sigma_i^y \mapsto i(b_i^\dagger a_i - a_i^\dagger b_i), \qquad \sigma_i^z \mapsto 1 - 2 b_i^\dagger b_i. \qquad (3.7) \]

(Note: when we write $\sigma_i^x$, we mean an operator on all N qubits, which is a tensor product
of $\sigma^x$ on qubit i, and the identity matrix on the other N − 1 qubits.) The above operators
commute with $a_j$ and $b_j$, for all $j \neq i$; hence they act correctly on the states in (3.6).
We also consider products of two Pauli matrices acting on qubits i and j, e.g., $\sigma_i^x \sigma_j^z$.
This corresponds to a product of two fermionic operators, e.g., $(a_i^\dagger b_i + b_i^\dagger a_i)(1 - 2 b_j^\dagger b_j)$.
(Note that $\sigma_i^x \sigma_j^z$ is equal to the tensor product of $\sigma^x$ on qubit i, $\sigma^z$ on qubit j, and the
identity matrix on the other N − 2 qubits.)

$H_{\mathrm{qubit}}$ can be written as a linear combination of terms of the form $\sigma_i^u$ and $\sigma_i^u \sigma_j^v$,
where $u, v \in \{x, y, z\}$ and $i, j \in \{1, \ldots, N\}$. We then construct $H_A$ by substituting the
corresponding fermionic operators.

Next we construct $H_B$. We want to guarantee that, for each i, exactly one of the
modes $a_i$ and $b_i$ is occupied. This can be achieved by setting $H_B = \sum_{i=1}^N \Pi_i$, where

\[ \Pi_i = 1 + (2 a_i^\dagger a_i - 1)(2 b_i^\dagger b_i - 1). \qquad (3.8) \]

To see why this works, note that $\Pi_i$ is diagonal in the basis consisting of the states

\[ (a_1^\dagger)^{s_1} (b_1^\dagger)^{t_1} \cdots (a_N^\dagger)^{s_N} (b_N^\dagger)^{t_N} |\Omega\rangle, \]

and has eigenvalue 2 if $s_i = t_i$, and eigenvalue 0 if $s_i \neq t_i$.

In addition, we claim that all of the $\Pi_i$ are biquadratic and commute with all of
the operators introduced in (3.7). (To see this, consider how the operators in (3.7) act
on the eigenstates of $\Pi_i$. Observe that each operator in (3.7) maps a 0-eigenstate to a
0-eigenstate, and maps a 2-eigenstate to a 2-eigenstate.)

The full Hamiltonian $H_{\text{fermi}}$ is given by

$$H_{\text{fermi}} = H_A + \beta H_B,$$

where $\beta$ is a real number which we will choose later. We claim that $H_{\text{fermi}}$ has the
same ground state energy as $H_{\text{qubit}}$. We know $H_A$ and $H_B$ commute, so $H_{\text{fermi}}$ is
block-diagonal with respect to the eigenspaces of $H_B$. Note that the eigenvalues of $H_B$ are
$0, 2, 4, \ldots, 2N$. Now set $\beta$ equal to a constant times the norm of $H_A$. This guarantees
that the ground state of $H_{\text{fermi}}$ will lie in the 0-eigenspace of $H_B$, so it will have exactly
one fermion per site. Thus the ground state of $H_{\text{fermi}}$ corresponds to the ground state
of $H_{\text{qubit}}$, and they have the same energy.
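
As a numerical sanity check, the construction can be tried on a toy 2-qubit Hamiltonian. The sketch below is illustrative only and not part of the original argument. It assumes a concrete matrix representation of the four modes (Jordan-Wigner matrices), reads the substitutions $\sigma_i^x \to a_i^\dagger b_i + b_i^\dagger a_i$ and $\sigma_i^z \to 1 - 2 b_i^\dagger b_i$ off the example product given earlier, and picks $\beta = \|H_A\| + 1$; the helper names and the sample Hamiltonian are hypothetical.

```python
import numpy as np
from functools import reduce

def mode_operators(m):
    """Annihilation operators for m fermionic modes (assumed Jordan-Wigner matrices)."""
    sz = np.diag([1.0, -1.0])
    id2 = np.eye(2)
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])  # |0><1|
    return [reduce(np.kron, [sz] * k + [lower] + [id2] * (m - k - 1))
            for k in range(m)]

def dag(x):
    return x.conj().T

N = 2                               # qubits; qubit i is encoded in modes a_i, b_i
ops = mode_operators(2 * N)
a, b = ops[0::2], ops[1::2]
Id = np.eye(2 ** (2 * N))

# Fermionic stand-ins for sigma^x_i and sigma^z_i (assumed from the example in the text).
X = [dag(a[i]) @ b[i] + dag(b[i]) @ a[i] for i in range(N)]
Z = [Id - 2 * dag(b[i]) @ b[i] for i in range(N)]

# Hypothetical toy Hamiltonian: sx(1) sz(2) + 0.5 sz(1) + 0.3 sx(2).
sx = np.array([[0.0, 1.0], [1.0, 0.0]])
sz = np.diag([1.0, -1.0])
id2 = np.eye(2)
H_qubit = np.kron(sx, sz) + 0.5 * np.kron(sz, id2) + 0.3 * np.kron(id2, sx)
H_A = X[0] @ Z[1] + 0.5 * Z[0] + 0.3 * X[1]

# Penalty term built from equation (3.8).
Pi = [Id + (2 * dag(a[i]) @ a[i] - Id) @ (2 * dag(b[i]) @ b[i] - Id)
      for i in range(N)]
H_B = sum(Pi)

beta = np.linalg.norm(H_A, 2) + 1.0   # one sufficient choice, not the text's constant
H_fermi = H_A + beta * H_B

e_qubit = np.linalg.eigvalsh(H_qubit)[0]
e_fermi = np.linalg.eigvalsh(H_fermi)[0]
commutator = H_A @ H_B - H_B @ H_A
print(e_qubit, e_fermi, np.abs(commutator).max())
```

With this choice of $\beta$, every state outside the 0-eigenspace of $H_B$ has energy at least $\|H_A\| + 2$, which is why the two printed ground state energies should coincide.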

Finally, note that $\|H_{\text{fermi}}\| \leq O(N^2 \|H_{\text{qubit}}\|)$. (To see this, note that
$\|H_{\text{fermi}}\| \leq O(\|H_A\|)$. We constructed $H_A$ from $H_{\text{qubit}}$ by writing $H_{\text{qubit}}$ as a linear
combination of Pauli matrices; there were $O(N^2)$ terms in the sum, each having norm
$O(\|H_{\text{qubit}}\|)$; hence $\|H_A\| \leq O(N^2 \|H_{\text{qubit}}\|)$.)

Also, note that $H_{\text{fermi}}$ only contains terms with at most 2 annihilation and 2 creation
operators. Thus it is a 2-local fermionic Hamiltonian. □

Since 2-Local Hamiltonian is QMA-hard [51], this implies that Fermionic 2-Local 
Hamiltonian is QMA-hard. 

We remark that this mapping from qubits to fermions may have other applications.
For instance, one can show that adiabatic quantum computation on fermionic systems is
universal.* One direction is already known: one can use a quantum circuit to simulate
the time evolution of a local Hamiltonian of fermions [70], [23], [2]. We can show the
reverse direction as follows: to simulate a quantum circuit, first construct an adiabatic
local Hamiltonian on qubits [9], then use the above mapping to translate it into an
adiabatic local Hamiltonian on fermions. We claim that this mapping preserves the gap
between the two lowest energy levels. To see this, observe that the energy spectrum of
$H_{\text{fermi}}$ contains an exact copy of the spectrum of $H_{\text{qubit}}$ (in the 0-eigenspace of $H_B$), along
with other higher energy levels (in the other eigenspaces of $H_B$). Thus the low-lying
energy levels of $H_{\text{fermi}}$ and $H_{\text{qubit}}$ are identical.

*Thanks to Stephen Jordan for pointing this out.

3.6 N-representability is QMA-hard



3.6.1 Convex Optimization with a Membership Oracle 

First we review the basic result of chapter 2: given a membership oracle for a closed
convex set $K \subseteq \mathbb{R}^n$, one can solve the optimization problem over $K$ in polynomial time.
This holds provided that $K$ contains a ball of radius $r$ centered at a known point $p$, and
$K$ is contained in a ball of radius $R$ centered at the origin, such that $R/r \leq \mathrm{poly}(n)$.
Furthermore, the precision required for the membership oracle depends polynomially
on the precision desired for the solution of the optimization problem. Formally, we say
that WOPT$_\varepsilon$ poly-time reduces to WMEM$_\delta$, for some $\delta \geq \mathrm{poly}(\varepsilon, (r/R), (1/n))$; this is
Proposition 2.8.

We rephrase this result slightly, so it will be more convenient to use later. First, we
define a variant of the weak optimization problem, WOPT$^*_\varepsilon$, as follows:

Given $c \in \mathbb{R}^n$, $\|c\| = 1$, $\gamma \in \mathbb{R}$, and $\varepsilon \in \mathbb{R}$, $\varepsilon > 0$, all specified with poly($n$)
bits of precision.

If there exists a vector $y \in K$ with $c \cdot y \geq \gamma + \varepsilon$, then answer "YES."
If for all $x \in K$, $c \cdot x \leq \gamma - \varepsilon$, then answer "NO."

(This problem differs from WOPT$_\varepsilon$ in that $y$ does not have to be deep inside $K$, and
we no longer consider $x$ that are slightly outside of $K$.) We also define a variant of the
weak membership problem, WMEM$^*_\delta$, as follows:

Given $y \in \mathbb{R}^n$, and $\delta \in \mathbb{R}$, $\delta > 0$, all specified with poly($n$) bits of precision.
If $y \in K$, then answer "YES."
If $y \notin S(K, \delta)$, then answer "NO."

(This problem differs from WMEM$_\delta$ in that $y$ does not have to be deep inside $K$.)
We show the following result: 

Theorem 3.3 Let $K$ be any closed convex set in $\mathbb{R}^n$, such that $S(p, r) \subseteq K \subseteq S(0, R)$,
as defined above. Then there is an oracle reduction from WOPT$^*_\varepsilon$ to WMEM$^*_\delta$, for
some $\delta \geq \mathrm{poly}(\varepsilon, (r/R), (1/n))$, which runs in time $\mathrm{poly}(n, (R/r), (1/\varepsilon))$.

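Proof: First we show a mapping reduction from WOPT$^*_\varepsilon$ to WOPT$_{\varepsilon r/4R}$. The
reduction is trivial: we only change the value of the parameter $\varepsilon$. Suppose we have a
"YES" instance of WOPT$^*_\varepsilon$, i.e., there exists $x \in K$ such that $c \cdot x \geq \gamma + \varepsilon$.
Define $x' = (1 - \delta)x + \delta p$, for some $\delta$ to be chosen later. Then $S(x', \delta r) \subseteq K$, and
$c \cdot x' = (1 - \delta)\, c \cdot x + \delta\, c \cdot p \geq \gamma + \varepsilon - 2\delta R$. Now set $\delta = \varepsilon/4R$. Then
$S(x', \varepsilon r/4R) \subseteq K$, and $c \cdot x' \geq \gamma + \varepsilon/2$, so this is a "YES" instance of WOPT.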
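
The estimate in the "YES" case can be spelled out as follows. This is only a worked version of the step above, under the assumptions $\|c\| = 1$, $x, p \in S(0, R)$, and $\delta \leq 1$ (which holds for $\delta = \varepsilon/4R$ whenever $\varepsilon \leq 4R$); the auxiliary vector $u$ is introduced here purely for illustration.

```latex
\begin{align*}
c \cdot x' &= c \cdot x - \delta\, c \cdot (x - p) \\
           &\geq (\gamma + \varepsilon) - \delta\, \|c\|\, \|x - p\| \\
           &\geq \gamma + \varepsilon - 2 \delta R
            \;=\; \gamma + \varepsilon/2
            \quad \text{for } \delta = \varepsilon/4R, \\[4pt]
x' + u     &= (1 - \delta)\, x + \delta\, (p + u/\delta) \in K
            \quad \text{whenever } \|u\| \leq \delta r .
\end{align*}
```

The last line uses that $p + u/\delta \in S(p, r) \subseteq K$ and that $K$ is convex; it gives $S(x', \delta r) \subseteq K$.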

Now suppose we have a "NO" instance of WOPT$^*_\varepsilon$, i.e., for all $x \in K$, $c \cdot x \leq \gamma - \varepsilon$.
This implies that for any $x' \in S(K, \varepsilon/2)$, $c \cdot x' \leq \gamma - \varepsilon/2$, so this is a "NO" instance of
WOPT.

Next, we use Proposition 2.8 to get a reduction from WOPT to WMEM. Finally,
WMEM trivially reduces to WMEM$^*$. □

3.6.2 N-representability is QMA-hard

Theorem 3.4 There is a poly-time oracle reduction from Fermionic 2-Local Hamiltonian
to $N$-representability.

Proof: Let us now assume that we have an efficient algorithm for $N$-representability. We
claim that this allows us to efficiently determine the ground state energy of any 2-local
Hamiltonian on fermions, $H_{\text{fermi}}$. The basic idea is to find the ground state of $H_{\text{fermi}}$ using
convex programming. However, instead of the full $N$-particle density matrix, we will just
find the 2-particle reduced density matrix, subject to the $N$-representability constraint.
The resulting convex program has polynomially many variables, and by assumption we
have an algorithm that can test whether the $N$-representability constraint is satisfied.
Thus this program can be solved, using convex optimization with a membership oracle.

We now describe the details. First, note that the interesting behavior in $H_{\text{fermi}}$ occurs
in the subspace of states with exactly $N$ particles. (We are assuming that $H_{\text{fermi}}$ comes
from the reduction given in the previous section.) Restricting ourselves to this subspace,
we have the identity $a_i^\dagger a_j = \frac{1}{N-1}\, a_i^\dagger \bigl(\sum_k a_k^\dagger a_k\bigr) a_j$, and we can write all the terms in $H_{\text{fermi}}$
in the form $a_i^\dagger a_j^\dagger a_l a_k$.

We can view $H_{\text{fermi}}$ as describing a system with an arbitrary number of particles;
$H_{\text{fermi}}$ simply specifies a 2-particle interaction, which acts on all pairs of particles. (Note
that, when written in second-quantized notation, $H_{\text{fermi}}$ has the same form irrespective
of the number of particles.) In particular, we can view $H_{\text{fermi}}$ as describing a system of 2
particles. Now, suppose this system is in state $\rho$, and suppose that $\rho$ is $N$-representable,
that is, there exists an $N$-particle state $\sigma$ such that $\mathrm{tr}_{3,\ldots,N}(\sigma) = \rho$. Then, using the
identity (3.2) for the matrix elements of the 2-RDM, we have that



This says that the 2-particle state $\rho$ has the same energy as the $N$-particle state $\sigma$, scaled
by a factor of $1/N(N - 1)$.

We construct a convex program that finds a 2-fermion density matrix $\rho$ that is
$N$-representable, and that minimizes $\mathrm{tr}(H_{\text{fermi}}\, \rho)$. This tells us the ground state energy
of $H_{\text{fermi}}$ (for the $N$-particle system). Note that this program has polynomially many
variables, the set of $N$-representable states is convex, and $\mathrm{tr}(H_{\text{fermi}}\, \rho)$ is a linear function
of $\rho$. Assuming that we have an efficient algorithm for $N$-representability, we claim that
we can solve this convex program in polynomial time.

One technical point concerns the geometry of the set $K$ of feasible solutions. The set
$K$ must be full-dimensional, i.e., $K$ cannot lie in a lower-dimensional subspace. (We also
need $K$ to have outer radius $R$ and inner radius $r$, such that $R/r$ is at most polynomially
large; we will revisit this issue later.) So we have to represent the 2-fermion state $\rho$ in
such a way that there are no redundant variables. To this end, let $\mathcal{S}$ be the complete set
of 2-particle observables introduced in section 3.2.2, and let $\ell = |\mathcal{S}|$; note that $\ell \leq d^4$.

We represent $\rho$ in terms of its expectation values $\alpha_S = \mathrm{tr}(S\rho)$, for all observables
$S \in \mathcal{S}$. Let $\alpha \in \mathbb{R}^\ell$ denote the vector of these expectation values, $\alpha = (\alpha_S)_{S \in \mathcal{S}}$. Then we
define $K$ to be the set of all vectors $\alpha \in \mathbb{R}^\ell$ such that the corresponding 2-fermion state
$\rho$ is $N$-representable. Note that the $N$-representability algorithm lets us test whether a
given point $\alpha$ is in $K$.

We write our Hamiltonian in the form

$$H_{\text{fermi}} = \gamma_0 I + \sum_{S \in \mathcal{S}} \gamma_S S.$$



Since we are viewing $H_{\text{fermi}}$ as an operator on the space of 2-particle states, we have
that the operators $S \in \mathcal{S}$ are orthogonal (see section 3.2.2). So the coefficients $\gamma_0$ and
$\gamma_S$ are given by the formulas



So our convex program can be written as follows: find some $\alpha \in K$ that minimizes the
function $f(\alpha) = \gamma_0 + \gamma \cdot \alpha$.

For future reference, let us bound the size of $\gamma_0$ and $\gamma$. Recall that, when restricted
to the space of 2-particle states, the operators $S \in \mathcal{S}$ have rank 1 or 2, with eigenvalues
1 or $\pm 1$ (see section 3.2.2). So $|\gamma_0| \leq \|H_{\text{fermi}}\|$, and $|\gamma_S| \leq |\mathrm{tr}(H_{\text{fermi}} S)| + |\gamma_0\, \mathrm{tr}(S)| \leq
3\|H_{\text{fermi}}\|$. Also, recall that $H_{\text{fermi}}$ is defined by a 2-particle interaction $A$ where $\|A\| \leq 1$.
Since we have only 2 particles, $H_{\text{fermi}} = A$, hence $\|H_{\text{fermi}}\| \leq 1$. So $|\gamma_0| \leq 1$ and $|\gamma_S| \leq 3$,
and hence $\|\gamma\| \leq 3\sqrt{\ell}$.

Given an algorithm for $N$-representability, we can solve the above convex program
(and thus Fermionic Local Hamiltonian) in polynomial time. The logic is as follows:
Fermionic Local Hamiltonian reduces to the weak optimization problem WOPT$^*$ on
the set $K$, which reduces to the weak membership problem WMEM$^*$ on the set $K$,
which reduces to $N$-representability. Numerical precision is a concern here, because the
algorithm for $N$-representability is allowed to make mistakes near the boundary of the
set $K$. We claim that, in order to solve Fermionic Local Hamiltonian with error $b - a$, we
require an algorithm for $N$-representability with error $\beta$, where $\beta \geq \mathrm{poly}((b - a), 1/d)$.
Also, the overall reduction runs in time $\mathrm{poly}(d, 1/(b - a))$.

The first and last steps in the reduction are easy, using the definitions of WOPT$^*$
and WMEM$^*$. Using the remarks above, we have that Fermionic Local Hamiltonian
(with error $b - a$) reduces to WOPT$^*_\varepsilon$ with $\varepsilon \geq \ldots$. And, using Lemma 3.1,
WMEM$^*_\delta$ reduces to $N$-representability with error $\ldots$.

The middle step in the reduction makes use of Theorem 3.3, and requires some further
explanation. This step requires a guarantee that $K$ is contained in a ball of radius $R$
centered at 0, and $K$ contains a ball of radius $r$ centered at some point $p$, such that
$R/r$ is at most polynomially large. In our case, we have the following bounds, which we
prove in the next section.

Lemma 3.5 $K$ is contained in a ball of radius $R = \sqrt{\ell}$, and $K$ contains a ball of radius
$r = 1/\ell^2 d^5$.

(Also recall that $\ell \leq d^4$.) Substituting into Theorem 3.3, we get that WOPT$^*_\varepsilon$ reduces
to WMEM$^*_\delta$ with $\delta \geq \mathrm{poly}(\varepsilon, 1/d)$.

Thus, given an efficient algorithm for $N$-representability, we get an efficient algorithm
for Fermionic Local Hamiltonian. This completes the proof that $N$-representability is
QMA-hard. □

3.6.3 Bounds on the Geometry of K 

Proof of Lemma 3.5: We claim that $K$ is contained in a ball of radius $R = \sqrt{\ell}$, and $K$
contains a ball of radius $r = 1/\ell^2 d^5$.

The first statement is easy to see, since for all $\alpha \in K$, and for all $S \in \mathcal{S}$, we have
$-1 \leq \alpha_S \leq 1$.

The second statement is less trivial. The obvious argument is as follows: let $\sigma$
be the maximally mixed state on $N$ particles, let $\rho$ be the corresponding reduced
2-particle state, and show that for any small perturbation of $\rho$, one can perturb $\sigma$ in a
way that agrees with $\rho$. But this argument runs into complications, because it is hard to
perturb $\sigma$ in a way that affects just two modes; one usually ends up affecting $N$ modes
simultaneously.

Instead, we use the following indirect argument. (We first sketch the overall argument,
then fill in the details.) We consider $N$-representability for different values of $N$;
let $K_N$ denote the set of all vectors $\alpha$ that are $N$-representable. We also define the
"particle-hole" observables, where the roles of $a_i$ and $a_i^\dagger$ are reversed. Let $\mathcal{S}'$ be the set
of 2-hole observables,

$X'_{IJ} = a_I a_J^\dagger + a_J a_I^\dagger$, for all $I \prec J$,
$Y'_{IJ} = -i a_I a_J^\dagger + i a_J a_I^\dagger$, for all $I \prec J$,
$Z'_I = a_I a_I^\dagger$, for all $I$ except the last one.

Let $\alpha'$ denote a vector containing expectation values for these observables, and let $K'_N$
be the set of all $\alpha'$ that are $N$-representable.

It is easy to see that $K_2$ contains a ball of radius $1/\mathrm{poly}(\ell)$ (this is the trivial case).
Now, using the anti-commutation relations, we can write each 2-particle observable as
a linear combination of 2-hole observables, and vice versa. (This holds for states where
the total number of particles is fixed.) This implies an invertible linear transformation
$A$ that maps $K_2$ to $K'_2$. We show that this transformation does not shrink $K_2$ by more
than a polynomial factor.

Next, note that $\frac{2}{(d-2)(d-3)} K'_2 = K_{d-2}$, since a state with 2 holes can be viewed as
a state with $d - 2$ particles. (There is also a normalization factor, to account for the
increase in the number of particles.) Thus $K_{d-2}$ contains a ball of radius $1/\mathrm{poly}(\ell)$.
Also, note that if a vector $\alpha$ is $N$-representable, then it is also $(N - 1)$-representable; so,
for all $3 \leq N \leq d - 2$, we have $K_N \subseteq K_{N-1}$. Thus, for all $2 \leq N \leq d - 2$, $K_N$ contains
a ball of radius $1/\mathrm{poly}(\ell)$. This completes the argument; now we describe the details.

First, some remarks about the definition of the set $K_N$. We define

$$K_N = \{\alpha \in \mathbb{R}^\ell \mid \text{there exists a 2-particle state } \rho \text{ such that } \rho \text{ is } N\text{-representable, and for all observables } S \in \mathcal{S},\ \alpha_S = \mathrm{tr}(S\rho)\}.$$

We can also describe $K_N$ in terms of an $N$-particle state $\sigma$, where $\mathrm{tr}_{3,\ldots,N}(\sigma) = \rho$.
However, some care is needed with the normalization factor for the expectation values
$\mathrm{tr}(S\sigma)$. Recall that

$$\rho_{ijkl} = \frac{1}{N(N-1)}\, \mathrm{tr}(a_k^\dagger a_l^\dagger a_j a_i\, \sigma) = \frac{1}{2}\, \mathrm{tr}(a_k^\dagger a_l^\dagger a_j a_i\, \rho).$$

Thus $\mathrm{tr}(S\sigma) = \frac{N(N-1)}{2}\, \mathrm{tr}(S\rho)$. So $K_N$ is given by

$$K_N = \{\alpha \in \mathbb{R}^\ell \mid \text{there exists an } N\text{-particle state } \sigma \text{ such that for all observables } S \in \mathcal{S},\ \alpha_S = \tfrac{2}{N(N-1)}\, \mathrm{tr}(S\sigma)\}.$$

The definition of the set $K'_N$ is exactly the same, but using the set of observables $\mathcal{S}'$
in place of $\mathcal{S}$.

We claim that $K_2$ contains a ball of radius $1/\mathrm{poly}(\ell)$ (we will give a precise bound
below). Note that this is the trivial case of $N$-representability; $K_2$ is the set of all vectors
$\alpha$ that correspond to 2-particle fermionic states. Consider the vector $\alpha$ that corresponds
to the maximally mixed state on two fermions, $\sigma = I/\binom{d}{2}$. The components of the vector
$\alpha$ are given by

$\alpha_{(Z_I)} = \mathrm{tr}(Z_I \sigma) = 1/\binom{d}{2}$,
$\alpha_{(X_{IJ})} = \mathrm{tr}(X_{IJ} \sigma) = 0$,
$\alpha_{(Y_{IJ})} = \mathrm{tr}(Y_{IJ} \sigma) = 0$.

We claim that, for any perturbation $\alpha + \eta$, $\|\eta\| \leq 1/\mathrm{poly}(\ell)$, we can perturb $\sigma$ in such
a way that it agrees with $\alpha + \eta$. We construct this perturbation as follows. Recall that
when we defined the set of observables $\mathcal{S}$, we chose an ordering on all the pairs of modes.
Let $L$ denote the pair of modes that comes last in this ordering. (Also recall that we
excluded the observable $Z_L$ from the set $\mathcal{S}$.) Now consider the following perturbation:

$$\sigma' = \sigma + \sum_{I \prec L} \eta_{(Z_I)} (Z_I - Z_L) + \frac{1}{2} \sum_{I \prec J} \bigl( \eta_{(X_{IJ})} X_{IJ} + \eta_{(Y_{IJ})} Y_{IJ} \bigr).$$

Here we view $(Z_I - Z_L)$, $X_{IJ}$ and $Y_{IJ}$ as operators on the space of 2-particle states.

This is a legal density matrix (positive semidefinite with trace 1), provided that
$\|\eta\| \leq 1/\ell d^2$. To see this, note that the operators $(Z_I - Z_L)$, $X_{IJ}$ and $Y_{IJ}$ have trace 0
and norm at most 1, and note that $\sigma = I/\binom{d}{2}$.

Also, we have that $\sigma'$ agrees with $\alpha + \eta$, that is,

$\mathrm{tr}(Z_I \sigma') = \alpha_{(Z_I)} + \eta_{(Z_I)}$,
$\mathrm{tr}(X_{IJ} \sigma') = \alpha_{(X_{IJ})} + \eta_{(X_{IJ})}$,
$\mathrm{tr}(Y_{IJ} \sigma') = \alpha_{(Y_{IJ})} + \eta_{(Y_{IJ})}$.

This follows from the orthogonality properties of $Z_I$, $X_{IJ}$ and $Y_{IJ}$, shown in section
3.2.2. (We emphasize that we are viewing these as operators on 2-particle states. $Z_I$
and $Z_{I'}$ are not orthogonal when we view them as operators on $N$-particle states.)
Thus we have shown that $K_2$ contains a ball of radius $1/\ell d^2$.

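Next, we construct an invertible linear transformation $A$ that maps $K_2$ to $K'_2$. We
begin with the following identity, which comes from repeated application of the
anti-commutation relations:*

$$a_a^\dagger a_b^\dagger a_d a_c = \delta_{bd}\delta_{ac} - \delta_{ad}\delta_{bc} + \delta_{ad}\, a_c a_b^\dagger + \delta_{bc}\, a_d a_a^\dagger - \delta_{ac}\, a_d a_b^\dagger - \delta_{bd}\, a_c a_a^\dagger + a_d a_c a_a^\dagger a_b^\dagger.$$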

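Thus if we write $I = \{a, b\}$ and $J = \{c, d\}$, we get the following expressions for $a_I^\dagger a_J$:

$a_I^\dagger a_J = a_J a_I^\dagger$, if $I \cap J = \emptyset$,
$\quad = 1 - a_b a_b^\dagger - a_a a_a^\dagger + a_I a_I^\dagger$, if $I = J$,
$\quad = -a_c a_a^\dagger + a_J a_I^\dagger$, if $a \neq c$ and $b = d$,
etc.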
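
The operator identity above can be checked numerically on a small system. The following sketch is an illustration rather than part of the proof: it assumes a Jordan-Wigner matrix representation of $d = 4$ modes, renames the mode labels $(a, b, c, d)$ of the text to `(p, q, r, s)` to avoid clashing with the operator symbol, and uses hypothetical helper names.

```python
import numpy as np
from functools import reduce
from itertools import product

def annihilators(d):
    """Jordan-Wigner matrices for d modes (an assumed concrete representation)."""
    sz = np.diag([1.0, -1.0])
    id2 = np.eye(2)
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])  # |0><1|
    return [reduce(np.kron, [sz] * i + [lower] + [id2] * (d - i - 1))
            for i in range(d)]

d = 4
a = annihilators(d)
adag = [x.T for x in a]          # the matrices are real, so dagger = transpose
one = np.eye(2 ** d)

def delta(p, q):
    return 1.0 if p == q else 0.0

def identity_residual(p, q, r, s):
    """Largest entry of LHS - RHS, with (a, b, c, d) of the text renamed (p, q, r, s)."""
    lhs = adag[p] @ adag[q] @ a[s] @ a[r]
    rhs = ((delta(q, s) * delta(p, r) - delta(p, s) * delta(q, r)) * one
           + delta(p, s) * a[r] @ adag[q]
           + delta(q, r) * a[s] @ adag[p]
           - delta(p, r) * a[s] @ adag[q]
           - delta(q, s) * a[r] @ adag[p]
           + a[s] @ a[r] @ adag[p] @ adag[q])
    return np.abs(lhs - rhs).max()

# Check every choice of the four mode labels.
worst = max(identity_residual(*idx) for idx in product(range(d), repeat=4))
print(worst)
```

A residual at the level of floating-point rounding for all $4^4$ index choices is consistent with the identity holding as an operator equation, not only on 2-particle states.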

This shows that $a_I^\dagger a_J$, which is a 2-particle operator, can be written as a linear
combination of 1-hole and 2-hole operators. Now we restrict all operators to act on the space of
states with exactly 2 particles (or equivalently, $d - 2$ holes). Then we have the identity

So a 1-hole operator can be written in terms of 2-hole operators. Substituting into 
the previous equation, we get that any 2-particle operator can be written as a linear 
combination of 2-hole operators. 

Furthermore, the 2-particle observables $Z_I$, $X_{IJ}$ and $Y_{IJ}$ can be written as linear
combinations of the 2-hole observables $Z'_I$, $X'_{IJ}$ and $Y'_{IJ}$ (note that $X_{IJ}$ is constructed
from $a_I^\dagger a_J$ and its adjoint; $Y_{IJ}$ is similar). Thus the expectation values of the 2-particle
observables are linear functions of the expectation values of the 2-hole observables. So
we have a linear transformation that maps $K'_2$ to $K_2$; this is $A^{-1}$.

Similarly, any 2-hole operator $a_I a_J^\dagger$ can be written as a linear combination of
2-particle operators $a_{I'}^\dagger a_{J'}$. The argument is almost the same as before: first we use
the anticommutation relations, then we use the identity

$$a_f^\dagger a_e = a_f^\dagger \Bigl( \sum_g a_g^\dagger a_g \Bigr) a_e = \sum_g a_f^\dagger a_g^\dagger a_g a_e$$

to replace 1-particle operators with 2-particle operators. This allows us to construct the
linear transformation $A$ that maps $K_2$ to $K'_2$.

*Note that the subscript $a$ refers to one of the modes, while $a$ in regular type is an
annihilation operator.

We now show that the linear transformation $A$ does not shrink $K_2$ by more than a
polynomial factor. Write the singular value decomposition $A = UDV$, where $U$ and $V$
are unitary, and $D$ is diagonal, with diagonal entries $D_{ii} > 0$. Let $B = A^{-1}$. Looking at
the matrix elements of $B$, we can see that

$$\mathrm{tr}(B^\dagger B) \leq \ell^2 d^2.$$

At the same time,

$$\mathrm{tr}(B^\dagger B) = \mathrm{tr}(U D^{-1} V V^{-1} D^{-1} U^{-1}) = \mathrm{tr}(D^{-2}) \geq D_{ii}^{-2},$$

for all $i$. So we have $D_{ii} \geq 1/\ell d$, for all $i$. That is, $A$ does not shrink $K_2$ by more than an
$\ell d$ factor in any direction.
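
The linear-algebra step used here, that every singular value of $A$ is at least $1/\sqrt{\mathrm{tr}(B^\dagger B)}$ with $B = A^{-1}$, can be illustrated numerically. The sketch below uses an arbitrary invertible matrix with hand-picked singular values; the matrix, its size and the variable names are assumptions for illustration and have nothing to do with the particular transformation $A$ of the proof.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

# Build A = U D V with orthogonal U, V and chosen positive singular values D.
U, _ = np.linalg.qr(rng.normal(size=(n, n)))
V, _ = np.linalg.qr(rng.normal(size=(n, n)))
D = np.array([3.0, 1.5, 1.0, 0.5, 0.2, 0.05])
A = U @ np.diag(D) @ V
B = np.linalg.inv(A)

singular_values = np.linalg.svd(A, compute_uv=False)
frob_sq = np.trace(B.T @ B)            # tr(B^T B) = sum_i D_ii^(-2) for real B
lower_bound = 1.0 / np.sqrt(frob_sq)   # every singular value of A is at least this

print(singular_values.min(), lower_bound)
```

Any upper bound on $\mathrm{tr}(B^\dagger B)$, such as one obtained by bounding the matrix elements of $B$, therefore translates directly into a lower bound on how much $A$ can shrink a ball.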

This implies that $K'_2$ contains a ball of radius $1/\ell^2 d^3$.

Next, we show that $\frac{2}{(d-2)(d-3)} K'_2 = K_{d-2}$. Consider what happens when we exchange
the creation operator $a_i^\dagger$ with the annihilation operator $a_i$, for each mode $i$. This
transforms 2-hole observables into 2-particle observables, and vice versa. In addition, this
transforms 2-particle Slater basis states into $(d - 2)$-particle Slater basis states, and vice
versa: the 2-particle state with modes $i$ and $j$ occupied corresponds to the $(d - 2)$-particle
state with modes $i$ and $j$ empty.

So take any point $\alpha' \in K'_2$, which represents the expectation values of the 2-hole
observables for some 2-particle state $\sigma$. Use $\sigma$ to construct the corresponding $(d - 2)$-particle
state $\tau$, as described above. Then the expectation values of the 2-hole observables for $\sigma$
are exactly the expectation values of the 2-particle observables for $\tau$. So $\frac{2}{(d-2)(d-3)} \alpha' \in
K_{d-2}$. (Note that we normalize $\alpha'$ to account for the increased number of particles; see
the definition of $K_N$.) This shows that $\frac{2}{(d-2)(d-3)} K'_2 \subseteq K_{d-2}$. A similar argument shows
that $K_{d-2} \subseteq \frac{2}{(d-2)(d-3)} K'_2$. This proves the claim.

Hence $K_{d-2}$ contains a ball of radius $1/\ell^2 d^5$.

Next, we show that $K_N \subseteq K_{N-1}$, for all $3 \leq N \leq d - 2$. Take any point $\alpha \in K_N$,
which represents the expectation values of the observables $S \in \mathcal{S}$ for some 2-particle
state $\rho$, where $\rho$ is $N$-representable. But if $\rho$ is $N$-representable, then it is also
$(N - 1)$-representable. To see this, take some $N$-particle state $\sigma$, such that $\mathrm{tr}_{3,\ldots,N}(\sigma) = \rho$;
trace out the $N$'th particle to get an $(N - 1)$-particle state $\sigma' = \mathrm{tr}_N(\sigma)$; and note that
$\mathrm{tr}_{3,\ldots,N-1}(\sigma') = \rho$. Thus $\alpha \in K_{N-1}$, which proves the claim.

Hence $K_N$ contains a ball of radius $1/\ell^2 d^5$, for all $3 \leq N \leq d - 2$. □

3.7 Fermionic Problems in QMA 

Theorem 3.6 Fermionic Local Hamiltonian and $N$-representability are in QMA.

Proof: A problem is in QMA if there exists a poly-time quantum verifier $V$ that takes
two inputs: a description of the problem $x$, and a "witness" $\tau$ (which is a quantum state
on polynomially many qubits). $V$ should have the following property: if $x$ is a "YES"
instance, then there exists a state $\tau$ that causes $V$ to output "true" with probability
$\geq p_1$; if $x$ is a "NO" instance, then for all possible states $\tau$, $V$ outputs "true" with
probability $\leq p_0$; and $p_1 - p_0 \geq 1/\mathrm{poly}(|x|)$.

Suppose we have a "YES" instance of Fermionic Local Hamiltonian or $N$-representability.
Intuitively, the witness should be an $N$-fermion state $\sigma$ (i.e., the ground state
of the fermionic Hamiltonian, or the $N$-fermion state that agrees with the given
2-RDM). Then the verifier works by measuring 2-fermion observables (we will discuss the
measurement procedure later). However, the standard model of quantum computation
uses qubits, so we need to represent the fermionic state $\sigma$ using qubits, in such a way
that the fermionic observables can be implemented efficiently.

We represent the fermionic state a using d qubits, via the following mapping: 

Call the resulting qubit state $\tilde{\sigma}$. Note that, if $\sigma$ has exactly $N$ fermions, then $\tilde{\sigma}$ lies in
the subspace of states $|i_1, \ldots, i_d\rangle$ where $i_1 + \cdots + i_d = N$.

We use the Jordan-Wigner transform to map the fermionic annihilation operators $a_i$
to qubit operators $A_i$:

$$a_i \mapsto A_i = -\Bigl( \bigotimes_{k=1}^{i-1} \sigma_k^z \Bigr) \otimes |0\rangle\langle 1|_i.$$

Likewise,

$$a_i^\dagger \mapsto A_i^\dagger = -\Bigl( \bigotimes_{k=1}^{i-1} \sigma_k^z \Bigr) \otimes |1\rangle\langle 0|_i.$$

One can check that the action of $A_i$ on the qubit states agrees with the action of $a_i$ on
the fermionic states (recall the definition of $a_i$ in section 3.2.1).
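
The qubit operators $A_i$ can be built explicitly for a small number of modes to confirm that they obey the fermionic anticommutation relations. This is a minimal sketch under stated assumptions: $d = 4$ modes, identity factors on the qubits after position $i$, and hypothetical function names; it does not check agreement with the sign conventions of section 3.2.1, only the algebra.

```python
import numpy as np
from functools import reduce

def jordan_wigner(d):
    """A_i = -(sigma^z on qubits before i) tensor |0><1| on qubit i, identity after."""
    sz = np.diag([1.0, -1.0])
    id2 = np.eye(2)
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])  # |0><1|
    return [-reduce(np.kron, [sz] * i + [lower] + [id2] * (d - i - 1))
            for i in range(d)]

def anticomm(x, y):
    return x @ y + y @ x

d = 4
A = jordan_wigner(d)
dim = 2 ** d

# {A_i, A_j} = 0 and {A_i, A_j^dagger} = delta_ij (matrices are real: dagger = transpose).
car_ok = all(
    np.allclose(anticomm(A[i], A[j]), 0)
    and np.allclose(anticomm(A[i], A[j].T), (i == j) * np.eye(dim))
    for i in range(d) for j in range(d)
)

# The number operator sum_k A_k^dagger A_k counts the qubits in state |1>.
T = sum(Ak.T @ Ak for Ak in A)
print(car_ok, np.diag(T))
```

The overall minus sign cancels in every product of an even number of operators, so it does not affect the number operator or any of the 2-fermion observables.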

Thus, we can transform a fermionic observable $O = a_i^\dagger a_j^\dagger a_l a_k + a_k^\dagger a_l^\dagger a_j a_i$ into a qubit
observable $\tilde{O} = A_i^\dagger A_j^\dagger A_l A_k + A_k^\dagger A_l^\dagger A_j A_i$. This is a tensor product of many single-qubit
observables and one four-qubit observable, so it can be measured efficiently. Similar
arguments apply for all of the 2-fermion observables in the set $\mathcal{S}$ (introduced in section
3.2.2).

We now describe the verifier $V$. This is quite similar to the verifier for the Local
Hamiltonian and Consistency problems on qubits (see chapter 2). The witness $\tau$ consists
of several (i.e., polynomially many) blocks, where each block has $d$ qubits, supposedly
representing one copy of the state $\tilde{\sigma}$. The verifier $V$ acts as follows:

On each block, $V$ first measures the observable $T = \sum_{k=1}^{d} |1\rangle\langle 1|_k$, and if the
outcome does not equal $N$, $V$ outputs "false." This projects each block
onto the space of $N$-fermion states.

Next, in the case of Fermionic Local Hamiltonian, $V$ transforms the fermionic
Hamiltonian $H$ into a qubit operator $\tilde{H}$ (note that $H$ is a linear
combination of the 2-fermion observables $S \in \mathcal{S}$), then uses phase estimation
to estimate the expectation value of $\tilde{H}$ for the state $\tilde{\sigma}$. $V$ compares this
with the energy threshold specified in the description of the problem, and
outputs "true" or "false" accordingly.

In the case of $N$-representability, $V$ picks a fermionic observable $S \in \mathcal{S}$ at
random, transforms it into a qubit observable $\tilde{S}$, and measures it on each
block to estimate the expectation value for the state $\tilde{\sigma}$. $V$ compares this
with the expectation value for the state $\rho$ specified in the description of
the problem, and outputs "true" or "false" accordingly.

The analysis of the verifier $V$ uses the same arguments as in chapter 2. One technical
difference is the use of the local fermionic observables $S \in \mathcal{S}$, rather than the local Pauli
matrices; however, the observables $S \in \mathcal{S}$ can be used in a similar way to extract
information from the witness $\tilde{\sigma}$ (see section 3.2.2). It is straightforward to see that, on a
"YES" instance, given the correct witness $\tau = \tilde{\sigma}^{\otimes r}$, the verifier $V$ outputs "true." On a
"NO" instance, the situation is more complicated: given an arbitrary state $\tau$, we want $V$
to output "false." First, note that if the measurement of the observable $T$ returns a value
different from $N$ on some block, then $V$ automatically returns "false." So without loss
of generality, we can assume that $\tau$ lies in the simultaneous eigenspace of the observables
$T$ (with eigenvalue $N$) on all the blocks. In other words, $\tau$ has exactly $N$ fermions on
each block. However, $\tau$ might not be a tensor product state, i.e., the different blocks
could be entangled. But this does not fool the verifier, by the same Markov argument
as in chapter 2 (originally due to [8]).

Thus we have that Fermionic Local Hamiltonian and $N$-representability are in QMA.

□ 

3.7.1 Pure-state N-representability is in QMA(2)

The pure-state $N$-representability problem is similar to the $N$-representability problem,
but with the extra constraint that the $N$-particle state must be pure.

In addition to $\rho$ and $\beta$, we are given a real number $\delta \geq 1/\mathrm{poly}(N)$, specified
with poly($N$) bits of precision. We have to distinguish between these two
cases:

• There exists an $N$-fermion state $\sigma$ such that $\sigma$ is pure (hence $\mathrm{tr}(\sigma^2) = 1$)
and $\mathrm{tr}_{3,\ldots,N}(\sigma) = \rho$. In this case, answer "YES."

• For all $N$-fermion states $\sigma$, either $\mathrm{tr}(\sigma^2) \leq 1 - \delta$ or $\|\mathrm{tr}_{3,\ldots,N}(\sigma) - \rho\|_1 \geq \beta$.
In this case, answer "NO."

Note that we use $\mathrm{tr}(\sigma^2)$ to measure the purity of the state $\sigma$, and we allow an error
tolerance $\delta \geq 1/\mathrm{poly}(N)$.

The class QMA(2) is similar to QMA, except that here the verifier $V$ receives two
unentangled quantum witnesses, $\tau$ and $\eta$ (so the combined state is $\tau \otimes \eta$) [58]. $V$ is
required to have the following property: if $x$ is a "YES" instance, then there exists a
product state $\tau \otimes \eta$ that causes $V$ to output "true" with probability $\geq p_1$; if $x$ is a "NO"
instance, then for all possible states of the form $\tau \otimes \eta$, $V$ outputs "true" with probability
$\leq p_0$; and $p_1 - p_0 \geq 1/\mathrm{poly}(N)$. (Note that for a QMA(2) verifier, it is not known
whether one can use parallel repetition to amplify the gap between the probabilities $p_1$
and $p_0$.)

Proposition 3.7 Pure-state $N$-representability is in QMA(2).

Proof: First we describe the "swap test." Given two unentangled states $\nu$ and $\eta$, on two
quantum systems of equal dimension, the swap test allows us to estimate the quantity
$\mathrm{tr}(\nu\eta)$. Let Swap denote the operation of exchanging the two systems. This is a unitary
operation, but it is also Hermitian, and it can be viewed as an observable with eigenvalues
1 and $-1$. Thus one can measure the Swap observable, using the same procedure
for measuring the Pauli matrices (see section 2.2). This procedure returns "0" with
probability $\frac{1}{2} + \frac{1}{2}\mathrm{tr}(\mathrm{Swap}(\nu \otimes \eta))$, and "1" with probability $\frac{1}{2} - \frac{1}{2}\mathrm{tr}(\mathrm{Swap}(\nu \otimes \eta))$. Then
a straightforward calculation shows that

$\mathrm{tr}(\mathrm{Swap}(\nu \otimes \eta)) = \mathrm{tr}((I \otimes \nu)\, \mathrm{Swap}\, (I \otimes \eta))$
$\quad = \mathrm{tr}(\mathrm{Swap}(I \otimes (\eta\nu)))$
$\quad = \mathrm{tr}(\eta\nu) = \mathrm{tr}(\nu\eta)$.

The swap test can be used to check the purity of the state $\nu$, as follows. If $\nu$ is pure,
and $\eta = \nu$, then $\mathrm{tr}(\nu\eta) = \mathrm{tr}(\nu^2) = 1$, so the test returns "0" with probability 1. But if
$\nu$ is not pure (and in particular $\mathrm{tr}(\nu^2) \leq 1 - \varepsilon$), then for all states $\eta$,

$$\mathrm{tr}(\nu\eta) \leq \sqrt{\mathrm{tr}(\nu^2)\, \mathrm{tr}(\eta^2)} \leq \sqrt{1 - \varepsilon} \leq 1 - \varepsilon/2,$$

so the test returns "0" with probability $\leq 1 - \varepsilon/4$. (Intuitively, $\eta$ serves as a "witness" to
the purity of the state $\nu$. Note that it is essential that $\nu$ and $\eta$ are independent states.)
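
The two facts used by the swap test, $\mathrm{tr}(\mathrm{Swap}(\nu \otimes \eta)) = \mathrm{tr}(\nu\eta)$ and the Cauchy-Schwarz bound on $\mathrm{tr}(\nu\eta)$, are easy to check on random density matrices. The sketch below is illustrative only: the dimension, ranks, random seed and helper names are assumptions, and it computes the outcome probability from the formula rather than simulating a measurement circuit.

```python
import numpy as np

def random_state(dim, rank, rng):
    """Random density matrix of the given rank (illustrative helper)."""
    g = rng.normal(size=(dim, rank)) + 1j * rng.normal(size=(dim, rank))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def swap_operator(dim):
    """Permutation matrix exchanging the two tensor factors: |i>|j> -> |j>|i>."""
    S = np.zeros((dim * dim, dim * dim))
    for i in range(dim):
        for j in range(dim):
            S[j * dim + i, i * dim + j] = 1.0
    return S

rng = np.random.default_rng(1)
dim = 3
nu = random_state(dim, rank=2, rng=rng)    # a mixed state
eta = random_state(dim, rank=1, rng=rng)   # a pure state
Swap = swap_operator(dim)

overlap = np.trace(Swap @ np.kron(nu, eta)).real
p_zero = 0.5 + 0.5 * overlap               # probability of outcome "0"
purity_bound = np.sqrt(np.trace(nu @ nu).real * np.trace(eta @ eta).real)

# With a pure state and eta equal to it, the test returns "0" with certainty.
p_zero_pure = 0.5 + 0.5 * np.trace(Swap @ np.kron(eta, eta)).real
print(overlap, np.trace(nu @ eta).real, purity_bound, p_zero, p_zero_pure)
```

The first two printed numbers should agree, and the overlap should never exceed the third, which is the quantity the purity check bounds away from 1 when $\nu$ is mixed.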

Now we describe the verifier for pure-state $N$-representability. The witness is a
product state $\tau \otimes \eta$, where $\tau$ is the usual witness for $N$-representability, consisting of
polynomially many blocks, while $\eta$ consists of a single block, which is guaranteed to
be unentangled with $\tau$. (Each block consists of $d$ qubits, and supposedly represents a
copy of the $N$-fermion state $\sigma$ (or, to be precise, the corresponding qubit state $\tilde{\sigma}$).) The
verifier $V$ works as follows:

First, $V$ measures the observable $T = \sum_{k=1}^{d} |1\rangle\langle 1|_k$ on each block, and if the
outcome does not equal $N$, $V$ outputs "false." This projects each block
onto the space of $N$-fermion states.

Then $V$ flips a coin, and does one of two things with equal probability.

If the coin comes up "heads," $V$ discards the state $\eta$, and performs the usual
verification procedure for $N$-representability on the state $\tau$ (i.e., $V$ uses
$\tau$ to estimate the expectation values of the 2-fermion observables).

If the coin comes up "tails," $V$ picks one block of $\tau$, uniformly at random,
and discards the rest of $\tau$. This produces the state $\tau^* = (1/r) \sum_{j=1}^{r} \tau^{(j)}$,
where $r$ is the number of blocks, and $\tau^{(j)}$ is the reduced state of the $j$'th
block. $V$ now has the state $\tau^* \otimes \eta$, and $V$ checks the purity of $\tau^*$, using
the swap test as described above.

On a "YES" instance, given the witness $\tau \otimes \eta$ where $\tau = \tilde{\sigma}^{\otimes r}$ and $\eta = \tilde{\sigma}$, the verifier
$V$ returns "true" with probability close to 1. On a "NO" instance, for any witness of
the form $\tau \otimes \eta$, we claim that $V$ returns "true" with probability bounded away from
1. Without loss of generality, we can assume that $\tau$ and $\eta$ lie in the subspace of states
with exactly $N$ fermions per block. However, $\tau$ might be an arbitrary entangled state
(not an $r$-fold product state). Nonetheless, we consider the state $\tau^* = (1/r) \sum_{j=1}^{r} \tau^{(j)}$
(defined above) on a single block. Both the purity test and the $N$-representability test
act on this state. Since this is a "NO" instance, we know that either $\mathrm{tr}((\tau^*)^2) \leq 1 - \delta$ or
$\|\mathrm{tr}_{3,\ldots,N}(\tau^*) - \rho\|_1 \geq \beta$ (note that we are abusing notation, using $\tau^*$ to denote both the
$N$-fermion state and its representation as a qubit state). Hence either the purity test or
the $N$-representability test will fail with significant probability, so $V$ will return "false."
□

3.8 Discussion 

3.8.1 Related Work in Quantum Information 

It is remarkable that checking consistency of 2-body reduced density operators is so
hard, while checking consistency of 1-body reduced density operators is simple [28]. This
can be understood from the previous discussion: intuitively, 1-body density operators
$\langle a_i^\dagger a_j \rangle$ correspond to Hamiltonians only containing bilinear terms in $a_i^\dagger$ and $a_j$; such
Hamiltonians can easily be diagonalized as they represent systems of free fermions. As
shown in [28], consistency can be decided in that case based solely on the eigenvalues
of the reduced density operators. A number of related problems have been investigated
recently [42l [I9l |26l EH IH] ; in particular, see [56] . 

These results have to be contrasted with our problem of deciding $N$-representability
for 2-body density operators, where the eigenvalues alone are not enough to decide
consistency; the eigenvectors are also relevant. Indeed, let us consider the simpler problem
where only the diagonal elements of the 2-body density operators, $D_{ij} = \langle a_i^\dagger a_j^\dagger a_j a_i \rangle$, are
specified. Using the mapping from spins to fermions discussed above, one easily finds
that these $D_{ij}$ correspond to local spin Hamiltonians which only contain commuting
operators. These are spin-glasses, and so the problem of deciding $N$-representability of
$\{D_{ij}\}$ is NP-hard [13]. It was indeed pointed out a long time ago that $N$-representability
restricted to the diagonal elements is equivalent to a combinatorial problem [87] that
was later shown to be equivalent to the NP-hard problem of deciding membership in the
boolean quadric polytope [33].

3.8.2 Applications to Quantum Chemistry 

There are various methods for calculating the 2-RDM corresponding to the ground state 
of a molecular system |291 Wf\ I64j . These methods necessarily involve solving some 
instances of the $N$-representability problem. Typically, one imposes a set of constraints,
called $N$-representability conditions, which can be efficiently computed, but only give an
approximation of the true set of $N$-representable 2-RDM's. For example, one can impose
positivity constraints on the $p$-particle reduced states, where $p$ is a small constant, say
2 or 3; these are called $p$-positivity conditions. One can then perform a variational
minimization, or use a more sophisticated method such as the contracted Schrodinger
equation (CSE). In the CSE method, one first integrates the $N$-particle Schrodinger
equation to get an equation that relates the 2-, 3- and 4-RDM's. The 3- and 4-RDM's
can then be approximated in terms of the 2-RDM, and one can solve for the 2-RDM
using an iterative procedure. Here, the $N$-representability conditions are expressed in
the approximate reconstruction of the 3- and 4-RDM's from the 2-RDM, and in the
iterative procedure.

We have shown that finding ground state energies by means of the $N$-representability
problem is intractable in the worst case. This leaves open the possibility of finding
efficient algorithms that give accurate results for particular physical systems (though
they must break down in the general case). The hope is that some physical systems may
have special features that make the problem easier. One example is one-dimensional
translationally invariant spin systems, where the density matrix renormalization group
allows for a systematic approximation of the convex set of allowed reduced density
operators from within [81]. Also, for some molecular systems, variational minimization
using 3-positivity conditions gives promising results [65]: this gives an approximation
of the convex set from the outside. The non-variational CSE method looks promising
as well, and is especially intriguing, as it combines $p$-positivity ideas with a particular
ansatz for the $N$-particle wave function; see [66] for a recent development in this area.

It would be very interesting to investigate the conditions under which these approx- 
imations are justified. While there is empirical evidence that these methods work well, 
it seems that certain questions — especially concerning the accuracy of these methods 
on larger molecules — can only be answered through a better theoretical understanding. 
Most of the previous work has focused on applying these methods to small molecules or 
simple "toy models," and measuring the accuracy of the results against those obtained 
from brute-force calculations (full configuration interaction) or exact analytic solutions.
However, based on this evidence it is hard to predict how well these methods will scale to 
larger, more complex molecules. In particular, does the accuracy decrease when we move 
to larger molecules? Ideally, one would wish to have some guarantee of the accuracy of 
the result, in cases where the true ground state energy is not already known. 

It may be that, on larger molecules, there is a tradeoff between the speed and accu- 
racy of these numerical methods. (For instance, one can always improve the accuracy by 
using p-positivity conditions with larger p, but the complexity grows exponentially with 
p; and indeed, in practice, 3-positivity conditions are much more computationally inten- 
sive than 2-positivity conditions.) Although it is very hard to answer these questions 
completely, theoretical investigations may shed some light. 

Finally, we remark that there are proposals for finding ground state energies of molec- 
ular systems by using a quantum computer [11^13]. These methods offer an exponential 
speedup, in that the quantum computer can actually represent the full $N$-particle state,
and measure its energy via phase estimation. However, to prepare an approximate 
ground state on the quantum computer, one must use heuristic methods, such as adia- 
batic evolution starting from the Hartree-Fock ground state. These heuristic methods 
are not expected to work in all cases, which is consistent with our result that Fermionic 
Local Hamiltonian is QMA-hard. 

In conclusion, we investigated the problem of $N$-representability, and characterized
its computational complexity by showing that it is QMA-complete. Obviously, the theory
of quantum computing was a prerequisite to understanding the complexity of this classic
problem.

Chapter 4



The Consistency Problem for 1-D 
and Stoquastic Systems 

(This chapter contains preliminary results. It has been superseded by the paper:
Y.-K. Liu, "The Local Consistency Problem for Stoquastic and 1-D Quantum Systems,"
arXiv:0712.1388 [quant-ph], 2007.)

4.1 Introduction 

Previously we showed that Consistency is QMA-complete, which implies that the Con- 
sistency and Local Hamiltonian problems have the same complexity (up to poly-time 
oracle reductions). In this chapter we will prove similar statements about some special 
cases of these problems, which are not known to be QMA-hard, and in fact seem to 
be strictly easier than QMA. We consider the Local Hamiltonian problem for certain 
1-dimensional spin chains, and also for so-called "stoquastic" systems; these cases are 
not known to be QMA-hard. We show that there are corresponding special cases of the 
Consistency problem that have the same complexity, up to poly-time oracle reductions. 

One direction is easy: Local Hamiltonian reduces to Consistency, using the same 
techniques as in the previous chapters. But the opposite direction, reducing Consistency 
to Local Hamiltonian, is nontrivial. In the general case, we could get such a reduction 
using the QMA-hardness of Local Hamiltonian; but we want a reduction to a special case 
of Local Hamiltonian that is not QMA-hard. Here we devise a different reduction from 
Consistency to Local Hamiltonian, that works in these special cases. This reduction uses 
convex optimization with a membership oracle, combined with a new trick: a connection 
between Local Hamiltonian and Consistency, via Lagrange duality. (This is section 4.2.) 

This duality idea is similar to recent work by Hall [41] on the "subsystem
compatibility problem." This problem is very much like Consistency, except that the input
consists of density matrices describing all subsets of size $n - 1$ (for a system of $n$ qubits),
rather than subsets of size $k$ for some constant $k$. Thus the input is exponentially large
in $n$, and the problem can be solved in time polynomial in the length of the input. In
contrast, for the Consistency problem, the input is polynomially large in $n$, and we show
a poly-time reduction to Local Hamiltonian.

Then we apply these ideas to the special case of one-dimensional spin chains.
Specifically, we have $n$ qudits (a qudit is a $d$-dimensional particle), arranged in a line with
nearest-neighbor interactions (that is, interactions between particles $i$ and $i + 1$, for
$i = 1, \ldots, n - 1$). Many simple models studied in condensed-matter physics are of this
form, and moreover there are heuristic methods, such as the density-matrix
renormalization group (DMRG), which solve these models efficiently in practice [73]. Although
the performance of these heuristics is not fully understood, this experience suggested 
that 1-D systems are computationally tractable, in contrast to systems in 2 or more 
dimensions. (One rigorous result along these lines is given by [71].) So it was a surprise
when Aharonov, Gottesman and Kempe showed that Local Hamiltonian on a 1-D chain 
of qudits (with d = 12) is QMA-hard [H [45]. It is still an open problem whether the 
problem is QMA-hard for smaller values of d, and for qubits in particular. 

We define the Consistency problem on a 1-D chain of qudits, where we are given
density matrices describing each pair of adjacent qudits. We show that, for a 1-D chain
of qubits ($d = 2$), Consistency and Local Hamiltonian have the same complexity (up to
poly-time oracle reductions). We also sketch how this result can be generalized to a 1-D
chain of qudits ($d > 2$). (This is section 4.3.)

We remark that the complex behavior of 1-D quantum systems is a sharp contrast 
to what happens in the classical world. For instance, Max-2-SAT, which is the classical 
analogue of Local Hamiltonian, is poly-time solvable when restricted to a 1-dimensional 
chain [6]. Also, inference in graphical models can be solved exactly in poly-time when 
the underlying graph is a tree. This has an intuitive explanation. Consider the Gibbs 
distribution associated with a (classical) tree-structured graphical model. Deleting any 
single node i breaks the tree into two or more disconnected components; moreover, 
variables in different components are independent conditioned on the variable at node 
i. Thus the correlations among variables have a simple structure (they are a Markov 
random field). However, this is no longer true when one considers the Gibbs state of a 
quantum Hamiltonian, even when interactions are restricted to lie on a tree. 
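
As an illustration of why the classical 1-D case is easy (a sketch added here, not part of the original text), the following Python code minimizes the energy of a classical chain with nearest-neighbor terms by dynamic programming, in time $O(n q^2)$ for $n$ sites with $q$ values each. The function names and the random test instance are illustrative assumptions; a weighted Max-2-SAT instance on a chain corresponds to $q = 2$, with each cost table giving the weight of the clauses violated by a pair of adjacent values.

```python
import itertools
import random

def chain_min_energy(costs, q):
    """Minimum of sum_i costs[i][a_i][a_{i+1}] over assignments a in {0..q-1}^n.

    costs[i] is a q x q table for the interaction between sites i and i+1.
    """
    best = [0.0] * q  # best[a] = min energy of the sites so far, last site fixed to a
    for table in costs:
        best = [min(best[a] + table[a][b] for a in range(q)) for b in range(q)]
    return min(best)

def brute_force_min_energy(costs, q):
    """Exponential-time reference implementation, for checking small cases."""
    n = len(costs) + 1
    return min(
        sum(costs[i][a[i]][a[i + 1]] for i in range(n - 1))
        for a in itertools.product(range(q), repeat=n)
    )

random.seed(0)
q, n = 2, 8
costs = [[[random.random() for _ in range(q)] for _ in range(q)] for _ in range(n - 1)]
dp_value = chain_min_energy(costs, q)
bf_value = brute_force_min_energy(costs, q)
```

The dynamic program works because, once the value at a site is fixed, the two halves of the chain decouple; this is exactly the conditional-independence structure described above, and it is what fails for quantum chains.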

Finally, we consider the class of "stoquastic" quantum systems, introduced in [22], [21].
A Hamiltonian is called "stoquastic" if all of its off-diagonal matrix elements (relative to
the standard basis) are less than or equal to 0. By the Perron-Frobenius theorem [14],
this implies that the ground state can be chosen to have the form
$|\psi\rangle = \sum_z c_z |z\rangle$, where
$|z\rangle$ are the standard basis states and the coefficients $c_z$ are all real and nonnegative. Since
the coefficients $c_z$ all have the same complex phase, they can be viewed as analogous to
a probability density, with $\sum_z c_z^2 = 1$.
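
The Perron-Frobenius statement can be checked numerically on a toy example. The Python sketch below is an addition, not from the original text; the $3 \times 3$ matrix and the function name are illustrative assumptions. It runs power iteration on $cI - H$, which has nonnegative entries when $H$ is stoquastic and $c$ is at least the largest diagonal entry, so every iterate keeps nonnegative coefficients.

```python
import math

def ground_state_by_power_iteration(H, iters=2000):
    """Approximate the ground state of a real symmetric matrix H (list of lists).

    Power iteration is applied to M = c*I - H, whose top eigenvector is the
    ground state of H. For stoquastic H and c >= max diagonal entry, M has
    nonnegative entries, which is the setting of the Perron-Frobenius theorem.
    """
    n = len(H)
    c = max(H[i][i] for i in range(n)) + 1.0
    M = [[(c if i == j else 0.0) - H[i][j] for j in range(n)] for i in range(n)]
    v = [1.0] * n  # strictly positive start vector
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    energy = sum(v[i] * H[i][j] * v[j] for i in range(n) for j in range(n))
    return energy, v

# Toy stoquastic Hamiltonian: all off-diagonal entries are <= 0.
H = [
    [0.5, -1.0, 0.0],
    [-1.0, 0.0, -2.0],
    [0.0, -2.0, 1.0],
]
energy, c_z = ground_state_by_power_iteration(H)
```

The returned coefficients `c_z` are nonnegative and normalized so that their squares sum to 1.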

Stoquastic Hamiltonians appear in many natural physical systems, as well as some 
versions of the adiabatic algorithm for combinatorial optimization [35] . However, there is 
some evidence that the Local Hamiltonian problem in this case is not QMA-hard. Bravyi 
et al showed that Stoquastic Local Hamiltonian is in AM, a class which is believed 
to lie "just above" NP in the polynomial hierarchy. If Stoquastic Local Hamiltonian 
were QMA-hard, this would imply that QMA is in AM, which is possible but perhaps a 
little unlikely. 

On the other hand, Bravyi et al. [22] also showed that Stoquastic Local Hamiltonian
is MA-hard, so it cannot be very much easier than general Local Hamiltonian. Indeed, it
could be that Stoquastic Local Hamiltonian is QMA-hard, and we are simply ignorant.
(However, such ignorance may be long-lived. It is still an open problem to show that
BQP is in the polynomial hierarchy, a much weaker result that would follow trivially if
QMA were in AM.)

We propose a stoquastic version of the Consistency problem. We believe this problem 
is equivalent to Stoquastic Local Hamiltonian (up to poly-time oracle reductions), and
we give a heuristic argument, modulo some technical details, for why this should be true. 
(This is section 4.4.) 

4.2 Reductions from Consistency to Local Hamiltonian 

Consider the standard versions of the Consistency and Local Hamiltonian problems, as 
defined in Chapter 2. Previously we gave reductions from Local Hamiltonian to Con- 
sistency (thus showing that Consistency is QMA-hard); now let us consider reductions 
in the opposite direction. One way is to use the QMA-hardness of Local Hamiltonian 
[53], [51]: since Consistency is in QMA, one can "encode" an instance of Consistency
into an instance of Local Hamiltonian. Here we will give a different reduction, based on 
Lagrange duality, which does not involve QMA-hardness. This reduction illustrates a 
simple and quite transparent relationship between the two problems, which is interesting 
in its own right. It will also be useful in dealing with special cases of these problems 
which are not QMA-hard. 

The idea comes from a theorem of "strong alternatives" in semidefinite programming
[17]. Let $F_1, \ldots, F_d$ be complex Hermitian matrices of dimension $N$. Consider the
following matrix inequality:

$$\sum_{i=1}^{d} x_i F_i + I \prec 0, \qquad (4.1)$$

where $x \in \mathbb{R}^d$ is a variable. (Notation: $M \prec 0$ means $M$ is strictly negative definite,
$M \succeq 0$ means $M$ is positive semidefinite, etc.) Also consider the following system of
inequalities:

$$Z \succeq 0, \quad Z \neq 0, \quad \mathrm{tr}(F_i Z) = 0 \quad (\forall i = 1, \ldots, d), \qquad (4.2)$$

where $Z$, a complex Hermitian matrix of dimension $N$, is a variable. The theorem states
that exactly one of the two inequalities (4.1) and (4.2) is feasible. In other words, if
(4.2) is feasible, then (4.1) is not; and if (4.2) is not feasible, then (4.1) is. (When this
property holds, we say that (4.1) and (4.2) are strong alternatives.)

Observe that inequality (4.2) can be used to express the Consistency problem: $Z$
is a global density matrix (unnormalized, but note that all the constraints remain the
same if we divide across by $\mathrm{tr}(Z)$), and we can choose the constraints $\mathrm{tr}(F_i Z) = 0$ to
ensure that $Z$ agrees with the desired local density matrices (note that the matrices $F_i$
will then be local observables). But now the expression $\sum_{i=1}^{d} x_i F_i + I$ in inequality (4.1)
is simply a local Hamiltonian, and estimating its largest eigenvalue is precisely the Local
Hamiltonian problem (modulo a sign flip). So a Local Hamiltonian oracle allows us to
test membership in the convex set defined by inequality (4.1); and, using the methods of
convex optimization described in Chapter 2, we can then decide the feasibility of (4.1).
Since (4.1) and (4.2) are strong alternatives, this lets us solve the Consistency problem.
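
As a sanity check, the alternatives can be worked out by hand in the smallest case. The following example is an addition, not taken from the original text: it assumes a single qubit ($N = 2$) and a single constraint $F_1 = X - \alpha I$, where $X$ is the Pauli matrix and $\alpha$ is a hypothetical target expectation value.

```latex
\begin{align*}
F_1 &= X - \alpha I, \qquad
  \text{eigenvalues of } x F_1 + I:\quad 1 + x(1-\alpha),\ \ 1 - x(1+\alpha). \\
\text{(4.2) feasible} &\iff \exists\, Z \succeq 0,\ Z \neq 0:\
  \operatorname{tr}(XZ) = \alpha \operatorname{tr}(Z) \iff |\alpha| \le 1. \\
\text{(4.1) feasible} &\iff \exists\, x \in \mathbb{R}:\
  x(1-\alpha) < -1 \ \text{ and } \ x(1+\alpha) > 1 \iff |\alpha| > 1.
\end{align*}
```

For $|\alpha| < 1$ the two conditions on $x$ force $x < 0$ and $x > 0$ at once, and for $|\alpha| = 1$ one of them reads $0 < -1$ or $0 > 1$; for $|\alpha| > 1$ any $x$ of the appropriate sign and sufficiently large magnitude works. So exactly one of the two systems is feasible, as the theorem asserts.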

This is the intuition, but some further work is needed to make it rigorous. We have 
to allow for the inverse-polynomial precision in the Consistency and Local Hamiltonian 
problems. Also, in order to do convex optimization with a membership oracle, the set of 
feasible solutions K must satisfy certain geometric properties. So we have to formulate 
inequality (4.1) in a different way. We will show a finite-precision, "algorithmic" version
of the theorem of strong alternatives. 

Theorem 4.1 There is a poly-time oracle reduction from Consistency to Local Hamil- 
tonian. 

Proof: First, recall the statement of the Consistency problem: 

We have a system of $n$ qubits, and we are given local density matrices
$\rho_1, \ldots, \rho_m$, where $\rho_i$ describes the subset of qubits $C_i \subseteq \{1, \ldots, n\}$. (We
assume $|C_i| \le k$ for some constant $k$.)

In addition, we are given a string "$1^s$" and a real number $\beta \ge 1/s$. (All
numbers are specified with poly($n$) bits of precision.) The problem is to
distinguish between the following two cases:

• There exists an $n$-qubit state $\sigma$ such that, for all $i$,
$\mathrm{tr}_{\{1,\ldots,n\}-C_i}(\sigma) = \rho_i$. In this case, answer "YES."

• For all $n$-qubit states $\sigma$, there exists an $i$ such that
$\|\mathrm{tr}_{\{1,\ldots,n\}-C_i}(\sigma) - \rho_i\|_1 \ge \beta$. In this case, answer "NO."

As before, we consider the $n$-qubit Pauli matrices $P = \bigotimes_{i=1}^{n} P_i$ where $P_i \in \{I, X, Y, Z\}$.
We say that $P$ is supported inside a subset $C \subseteq \{1, \ldots, n\}$ if, for all $i \notin C$, $P_i = I$. Then
we define $S$ to be the set of "local" Pauli matrices, excluding the identity matrix,

$$S = \bigcup_{i=1}^{m} \{P \mid P \text{ is supported inside } C_i\} - \{I\}.$$

We also let $d = |S|$, and note that $d \le 4^k m - 1$. These are the local observables,
and knowing their expectation values is equivalent to knowing the local reduced density
matrices $\rho_1, \ldots, \rho_m$.
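
The set $S$ is easy to enumerate explicitly. The Python sketch below is an illustration added here, not part of the original text; it represents Pauli matrices as strings over `IXYZ`, indexes qubits from 0, and uses a hypothetical helper name `local_paulis`.

```python
from itertools import product

def local_paulis(n, subsets):
    """Return the set S of n-qubit Pauli strings supported inside some subset.

    Paulis are written as strings over "IXYZ"; qubits are indexed 0..n-1.
    The all-identity string is excluded, as in the definition of S.
    """
    S = set()
    for C in subsets:
        C = sorted(C)
        for labels in product("IXYZ", repeat=len(C)):
            pauli = ["I"] * n
            for qubit, label in zip(C, labels):
                pauli[qubit] = label
            S.add("".join(pauli))
    S.discard("I" * n)
    return S

# A chain of 3 qubits with the two nearest-neighbor subsets (k = 2, m = 2).
n, k = 3, 2
subsets = [{0, 1}, {1, 2}]
S = local_paulis(n, subsets)
d = len(S)
```

In this example each subset contributes 16 Pauli strings, 4 of which (those supported on the shared middle qubit) are counted twice, so $d = 27$, which respects the bound $d \le 4^k m - 1 = 31$.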

Suppose we have an instance of the Consistency problem. For each observable $P \in S$,
we define $\alpha_P$ to be the desired expectation value, which we compute as follows: pick
some subset $C_i$ such that $P$ is supported in $C_i$, then set $\alpha_P = \mathrm{tr}(P\rho_i)$.

Let us make a couple of observations. Clearly, if this is a "YES" instance of
Consistency, then there exists an $n$-qubit state $\sigma$ such that, for all $P \in S$, $\mathrm{tr}(P\sigma) = \alpha_P$.

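We claim that, if this is a "NO" instance of Consistency, then for all $n$-qubit states
$\sigma$, $\sum_{P \in S} |\mathrm{tr}(P\sigma) - \alpha_P| \ge \beta$. This can be seen as follows. Note that, for any $\sigma$, there is
some subset $C_i$ such that $\|\tilde\sigma - \rho_i\|_1 \ge \beta$, where $\tilde\sigma = \mathrm{tr}_{\{1,\ldots,n\}-C_i}(\sigma)$. Using the matrix
Cauchy-Schwarz inequality [18], $\|\tilde\sigma - \rho_i\|_1 \le \|\tilde\sigma - \rho_i\|_2 \sqrt{2^{|C_i|}}$. By Fourier analysis,

\begin{align*}
\|\tilde\sigma - \rho_i\|_2
  &= \frac{1}{\sqrt{2^{|C_i|}}} \Bigl( \sum_{P \text{ supp. on } C_i} \mathrm{tr}(P(\tilde\sigma - \rho_i))^2 \Bigr)^{1/2} \\
  &\le \frac{1}{\sqrt{2^{|C_i|}}} \Bigl( \sum_{P \in S} (\mathrm{tr}(P\sigma) - \alpha_P)^2 \Bigr)^{1/2}
   \le \frac{1}{\sqrt{2^{|C_i|}}} \sum_{P \in S} |\mathrm{tr}(P\sigma) - \alpha_P|.
\end{align*}

The claim follows by combining these inequalities.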

Next we write down a convex program and its dual. For each local observable $P \in S$,
we define a new observable

$$F_P = P - \alpha_P I,$$

which is shifted so that the desired expectation value now equals 0. We also define $F(x)$
to be a linear combination of these observables,

$$F(x) = \sum_{P \in S} x_P F_P + I, \quad \text{for } x \in \mathbb{R}^d.$$

Now consider the following convex program:

Find some $x \in [-1, 1]^d$ and $s \in [1 - 2d, 1 + 2d]$ that
minimize $s$ such that $F(x) \preceq sI$.

To see that this is a convex program, recall that the largest eigenvalue of $F(x)$ is a
convex function of $x$, since it can be written as the pointwise maximum over a family of
affine functions of $x$ (namely $\langle v | F(x) | v \rangle$, taken over all unit vectors $|v\rangle$).
The variable $s$ is redundant here, but it will play a role later when
we apply algorithms to solve this program. We will refer to this as the primal program;
let $p^*$ denote the optimal value of the objective function $s$.
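
For intuition, the primal program can be worked out explicitly for a single qubit (this example is an addition, not from the original text). With $S = \{X, Y, Z\}$ and target values $\alpha = (\alpha_X, \alpha_Y, \alpha_Z)$, the matrix $\sum_P x_P P$ has eigenvalues $\pm\|x\|_2$, so the largest eigenvalue of $F(x)$ is $1 - x \cdot \alpha + \|x\|_2$. The Python sketch below minimizes this expression over a coarse grid in $[-1, 1]^3$; the grid, the function names, and the two test vectors are illustrative assumptions.

```python
import itertools
import math

def lambda_max_F(x, alpha):
    """Largest eigenvalue of F(x) = sum_P x_P (P - alpha_P I) + I for one qubit."""
    dot = sum(xp * ap for xp, ap in zip(x, alpha))
    return 1.0 - dot + math.sqrt(sum(xp * xp for xp in x))

def primal_value_on_grid(alpha, steps=4):
    """Minimum of lambda_max_F over a grid in [-1, 1]^3 (an upper bound on p*)."""
    grid = [-1.0 + 2.0 * i / steps for i in range(steps + 1)]
    return min(lambda_max_F(x, alpha) for x in itertools.product(grid, repeat=3))

consistent = (0.0, 0.0, 0.6)     # |alpha| <= 1: some qubit state has these values
inconsistent = (0.0, 0.0, 1.5)   # |alpha| > 1: no qubit state has these values
p_yes = primal_value_on_grid(consistent)
p_no = primal_value_on_grid(inconsistent)
```

For these two inputs the minimum is attained on the grid (at $x = 0$ and at $x = (0, 0, 1)$ respectively), giving the values 1 and 0.5. A consistent target leaves the optimum at 1, while an inconsistent one pushes it strictly below 1.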
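The dual program is as follows:

Find some $2^n \times 2^n$ complex matrix $Z$ that
maximizes $g(Z)$ such that $Z \succeq 0$ and $\mathrm{tr}(Z) = 1$,

where the dual function $g(Z)$ is given by

\begin{align*}
g(Z) &= \inf_{x \in [-1,1]^d,\ s \in [1-2d,\,1+2d]} \; s + \mathrm{tr}(Z(F(x) - sI)) \\
     &= \inf_{x \in [-1,1]^d} \mathrm{tr}(Z F(x)) \\
     &= \inf_{x \in [-1,1]^d} \sum_{P \in S} x_P \, \mathrm{tr}(Z F_P) + 1.
\end{align*}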
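
The infimum over the box can be evaluated in closed form, which is the step used implicitly in the "YES" and "NO" cases of the proof. The short derivation below is added for clarity and uses only the definitions above; each $\mathrm{tr}(Z F_P)$ is real because $Z$ and $F_P$ are Hermitian.

```latex
\begin{align*}
\inf_{x \in [-1,1]^d} \sum_{P \in S} x_P \operatorname{tr}(Z F_P)
  &= \sum_{P \in S} \inf_{x_P \in [-1,1]} x_P \operatorname{tr}(Z F_P)
   = -\sum_{P \in S} \bigl|\operatorname{tr}(Z F_P)\bigr|, \\
g(Z) &= 1 - \sum_{P \in S} \bigl|\operatorname{tr}(Z F_P)\bigr|
      = 1 - \sum_{P \in S} \bigl|\operatorname{tr}(Z P) - \alpha_P\bigr|
      \quad \text{when } \operatorname{tr}(Z) = 1.
\end{align*}
```

The infimum is attained at $x_P = -\operatorname{sign}(\operatorname{tr}(Z F_P))$, so $g(Z)$ equals 1 minus the total mismatch between the expectation values of $Z$ and the targets $\alpha_P$.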

Let $d^*$ denote the optimal value of the objective function $g(Z)$. Strong duality holds
because the primal problem is convex and satisfies a generalized Slater condition [17] (to
see this, note that the point $(x, s) = (0, 2)$ is strictly feasible, i.e., it lies in the relative
interior of the domain, and it satisfies $F(x) \prec sI$). Strong duality implies that $p^* = d^*$,
i.e., the optimal values of the primal and dual programs are equal.

We now give a poly-time oracle reduction from Consistency to Local Hamiltonian.
We show that Consistency reduces to the weak optimization problem $\mathrm{WOPT}_\beta$, which
reduces to the weak membership problem $\mathrm{WMEM}_\delta$, which reduces to Local
Hamiltonian.

First, suppose we have a "YES" instance of Consistency. Then there exists an
$n$-qubit state $\sigma$ such that, for all $P \in S$, $\mathrm{tr}(P\sigma) = \alpha_P$. So in the dual program there exists
some $Z \succeq 0$, $\mathrm{tr}(Z) = 1$, such that for all $P \in S$, $\mathrm{tr}(Z F_P) = 0$. This implies $g(Z) = 1$,
hence the dual program has optimal value $d^* \ge 1$. By strong duality, the primal program
has optimal value $p^* \ge 1$.

On the other hand, suppose we have a "NO" instance of Consistency. Then for all
$n$-qubit states $\sigma$, $\sum_{P \in S} |\mathrm{tr}(P\sigma) - \alpha_P| \ge \beta$. So, in the dual program, for all $Z$ such that
$Z \succeq 0$ and $\mathrm{tr}(Z) = 1$, we have that $\sum_{P \in S} |\mathrm{tr}(Z F_P)| \ge \beta$, which implies $g(Z) \le 1 - \beta$.
Thus the dual program has optimal value $d^* \le 1 - \beta$. By strong duality, the primal
program has optimal value $p^* \le 1 - \beta$.

So we have reduced Consistency to the problem of distinguishing between the two
cases $p^* \ge 1$ and $p^* \le 1 - \beta$ for the primal program. This is an instance of the weak
optimization problem $\mathrm{WOPT}_\beta$ over the convex set

$$K = \{(x, s) \in [-1, 1]^d \times [1 - 2d, 1 + 2d] \mid F(x) - sI \preceq 0\}.$$

Now we will reduce $\mathrm{WOPT}_\beta$ to $\mathrm{WMEM}_\delta$. First we need some bounds on the
geometry of $K$. It is easy to see that $K$ is contained within a ball of radius
$R = \sqrt{d + (1 + 2d)^2} \le O(d)$. In addition, we claim that $K$ contains a ball around the point
$(0, \ldots, 0, 2)$ of radius $r = \frac{1}{4(d+1)}$. To see this, consider an arbitrary point $(y, t + 2)$ where
$y \in \mathbb{R}^d$, $t \in \mathbb{R}$ and $\sqrt{\|y\|^2 + t^2} \le \frac{1}{4(d+1)}$. The operator

$$F(y) - (t + 2)I = \sum_{P \in S} y_P F_P - tI - I$$

has all of its eigenvalues bounded above by
$\sum_{P \in S} \frac{1}{4(d+1)} \|F_P\| + \frac{1}{4(d+1)} - 1 \le -\frac{1}{2}$ (using
the fact that $\|F_P\| \le 2$). Thus $(y, t + 2)$ is in $K$. This proves the claim.
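
The arithmetic behind the inner-ball claim and the resulting ratio $R/r$ can be checked mechanically. The Python sketch below is an addition, not from the original text; it evaluates the bound $2d \cdot r + r - 1$ with exact fractions and compares $R/r$ against $d^2$. The function names and the sample values of $d$ are illustrative.

```python
from fractions import Fraction
import math

def inner_ball_margin(d):
    """Upper bound on the eigenvalues of F(y) - (t+2)I inside the claimed ball.

    Uses |y_P| <= r and |t| <= r with r = 1/(4(d+1)), and ||F_P|| <= 2.
    """
    r = Fraction(1, 4 * (d + 1))
    return d * r * 2 + r - 1

def radius_ratio(d):
    """R / r for R = sqrt(d + (1+2d)^2) and r = 1/(4(d+1))."""
    return math.sqrt(d + (1 + 2 * d) ** 2) * 4 * (d + 1)

sample_d = (1, 3, 27, 1000)
margins = {d: inner_ball_margin(d) for d in sample_d}
ratios = {d: radius_ratio(d) / d ** 2 for d in sample_d}
```

The margin equals $\frac{2d+1}{4(d+1)} - 1$, which is at most $-\frac{1}{2}$ for every $d \ge 1$, and $R/r$ stays within a constant factor of $d^2$.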

So we have $R/r \le O(d^2)$. By theorem 3.3, $\mathrm{WOPT}_\beta$ reduces to $\mathrm{WMEM}_\delta$ where
$\delta \ge \mathrm{poly}(\beta, 1/d)$, with running time $\mathrm{poly}(d, 1/\beta)$.

Finally, we reduce $\mathrm{WMEM}_\delta$ to the Local Hamiltonian problem. Observe that, since
the $F_P$ are local operators, $F(x)$ is a local Hamiltonian. Given an oracle that solves
the Local Hamiltonian problem, we can estimate the largest eigenvalue of $F(x)$ (i.e., minus the
smallest eigenvalue of $-F(x)$), and thus decide whether $(x, s)$ is in the set $K$.

Suppose we have a "YES" instance of $\mathrm{WMEM}_\delta$. Then $(x, s) \in K$, so $F(x) \preceq sI$,
i.e., all eigenvalues of $-F(x)$ are $\ge -s$. So this is a "NO" instance of Local Hamiltonian.

Now suppose we have a "NO" instance of $\mathrm{WMEM}_\delta$. Then $(x, s) \notin S(K, \delta)$, and in
particular, $(x, s + \delta) \notin K$. So $F(x) \not\preceq (s + \delta)I$, i.e., $-F(x)$ has an eigenvalue that is
$< -s - \delta$. So this is a "YES" instance of Local Hamiltonian.

Note that $\|F(x)\| \le \sum_{P \in S} \|F_P\| + 1 \le 2d + 1$. Thus, $\mathrm{WMEM}_\delta$ reduces to Local
Hamiltonian with precision $\delta / (2d + 1)$.

Thus we conclude that Consistency (with precision $\beta$) reduces to Local Hamiltonian
(with precision $\mathrm{poly}(\beta, 1/d)$), and the running time is $\mathrm{poly}(d, 1/\beta)$. Note that $d \le 4^k m$
is polynomial in the size of the input. □
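
The last step of the proof, answering membership queries for $K$ with an energy oracle, can be sketched for a single qubit, where the ground state energy of $-F(x)$ has a closed form. The Python code below is an illustration added here, not the author's procedure: the exact formula stands in for a Local Hamiltonian oracle, and the acceptance threshold $-s - \delta/2$ is an assumed choice that separates the two promised cases.

```python
import math

def ground_energy_minus_F(x, alpha):
    """Stand-in for a Local Hamiltonian oracle on one qubit.

    Returns the smallest eigenvalue of -F(x), where
    F(x) = sum_P x_P (P - alpha_P I) + I and P ranges over X, Y, Z.
    """
    dot = sum(xp * ap for xp, ap in zip(x, alpha))
    return -(1.0 - dot + math.sqrt(sum(xp * xp for xp in x)))

def weak_membership(x, s, alpha, delta):
    """Answer a weak membership query for K using the energy oracle.

    Points in K are always accepted; points with (x, s + delta) outside K
    are always rejected. In between, either answer is allowed.
    """
    d = len(x)
    in_box = all(-1.0 <= xp <= 1.0 for xp in x) and 1 - 2 * d <= s <= 1 + 2 * d
    if not in_box:
        return False
    energy = ground_energy_minus_F(x, alpha)
    # F(x) <= s I holds exactly when every eigenvalue of -F(x) is >= -s.
    return energy >= -s - delta / 2.0

alpha = (0.0, 0.0, 0.6)
inside = weak_membership((0.0, 0.0, 0.0), 2.0, alpha, delta=0.01)
outside = weak_membership((0.0, 0.0, 1.0), 1.0, alpha, delta=0.01)
```

Here the first query point is the strictly feasible point $(0, 2)$ used in the Slater condition and is accepted, while the second has largest eigenvalue 1.4 > 1 and is rejected.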

4.3 Consistency for 1-D Systems 

Let us consider a 1-dimensional chain of $n$ qudits (a qudit is a $d$-dimensional particle),
with nearest-neighbor interactions (i.e., interactions between particle $i$ and particle $i + 1$,
for $i = 1, \ldots, n - 1$).

First consider the case of qubits ($d = 2$). The reduction from Local Hamiltonian to
Consistency shown in chapter 2 (theorem 2.12), and the reverse reduction shown above
(theorem 4.1), both preserve the neighborhood structure of the problems: that is, each
local term in the Hamiltonian corresponds to a local density matrix, and vice versa.
Thus we have:

Theorem 4.2 On a 1-D chain of qubits (d = 2), Local Hamiltonian and Consistency 
are equivalent (with respect to poly-time oracle reductions) . 

We will now sketch one way of extending these results to the case of qudits ($d > 2$).
The first step is to define a set of observables for a single qudit, with nice properties
similar to the Pauli matrices. Let $|i\rangle$, $i = 0, \ldots, d - 1$, denote the standard basis states
for a single qudit. Also, let $\mathrm{i}$ (in plain, not italic type) denote the square root of $-1$.

\begin{align*}
X_{ij} &= |i\rangle\langle j| + |j\rangle\langle i|, & 0 \le i < j \le d - 1, \\
Y_{ij} &= \mathrm{i}|j\rangle\langle i| - \mathrm{i}|i\rangle\langle j|, & 0 \le i < j \le d - 1, \\
Z_i &= \frac{1}{i+1} \sum_{a=0}^{i} |a\rangle\langle a| - |i+1\rangle\langle i+1|, & 0 \le i \le d - 2.
\end{align*}

Note that $Z_i$ is the diagonal matrix whose diagonal consists of $\frac{1}{i+1}$ in the first $i + 1$
positions, followed by $-1$, followed by 0 in all the remaining positions. We have a total
of $2\binom{d}{2} + (d - 1) = d(d - 1) + (d - 1) = d^2 - 1$ observables.

These observables satisfy the following orthogonality relations:

    A          B          tr(AB)
    I          I          d
    I          X_{kl}     0
    I          Y_{kl}     0
    I          Z_k        0
    X_{ij}     X_{kl}     2 if (i,j) = (k,l); 0 otherwise
    X_{ij}     Y_{kl}     0
    X_{ij}     Z_k        0
    Y_{ij}     Y_{kl}     2 if (i,j) = (k,l); 0 otherwise
    Y_{ij}     Z_k        0
    Z_i        Z_k        1 + 1/(i+1) if i = k; 0 otherwise

In addition, note that $\|X_{ij}\| = \|Y_{ij}\| = \|Z_i\| = 1$.

We can now use these qudit observables in the same way that we used the Pauli
matrices for qubits. We construct $n$-qudit observables by taking tensor products of
single-qudit observables:

$$P = \bigotimes_{a=1}^{n} P_a, \qquad P_a \in \{I, X_{ij}, Y_{ij}, Z_i\}.$$

Note that for any $n$-qudit observables $P$ and $Q$, $\mathrm{tr}(PQ) = \prod_{a=1}^{n} \mathrm{tr}(P_a Q_a)$. Any $n$-qudit
density matrix $\rho$ can be written in the form

$$\rho = \sum_{P} \frac{\alpha_P}{\mathrm{tr}(P^2)} P, \qquad \alpha_P = \mathrm{tr}(P\rho).$$

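We say that $P$ is supported inside a subset $C \subseteq \{1, \ldots, n\}$ if for all $i \notin C$, $P_i = I$.
If this is the case, we define $P|_C = \bigotimes_{i \in C} P_i$, which we call the "restriction" of $P$ to the
subset $C$. We can write the reduced density matrix for the subset $C$ in the form

\begin{align*}
\rho^{(C)} &= \mathrm{tr}_{\{1,\ldots,n\}-C}(\rho) \\
  &= \sum_{P \text{ supported in } C} \frac{\alpha_P}{\mathrm{tr}(P^2)} \, \mathrm{tr}_{\{1,\ldots,n\}-C}(P) \\
  &= \sum_{P \text{ supported in } C} \frac{\alpha_P}{\mathrm{tr}((P|_C)^2)} \, P|_C.
\end{align*}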

Now we can use essentially the same reductions as before, from Local Hamiltonian 
to Consistency and vice versa, for systems of qudits. (Details omitted.) 
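
The single-qudit observables $X_{ij}$, $Y_{ij}$, $Z_i$ defined above can be built and checked numerically. The Python sketch below is an addition, not part of the original text; it assumes the definition $Z_i = \frac{1}{i+1} \sum_{a \le i} |a\rangle\langle a| - |i+1\rangle\langle i+1|$, stores matrices as nested lists of complex numbers, and uses hypothetical helper names.

```python
def qudit_observables(d):
    """Build the d^2 - 1 single-qudit observables as d x d lists of complex numbers."""
    def zeros():
        return [[0j] * d for _ in range(d)]

    obs = {}
    for i in range(d):
        for j in range(i + 1, d):
            X, Y = zeros(), zeros()
            X[i][j] = X[j][i] = 1
            Y[j][i], Y[i][j] = 1j, -1j   # Y_ij = i|j><i| - i|i><j|
            obs[("X", i, j)] = X
            obs[("Y", i, j)] = Y
    for i in range(d - 1):
        Z = zeros()
        for a in range(i + 1):
            Z[a][a] = 1 / (i + 1)
        Z[i + 1][i + 1] = -1
        obs[("Z", i)] = Z
    return obs

def trace_of_product(A, B):
    """Return tr(AB) for square matrices given as nested lists."""
    n = len(A)
    return sum(A[r][c] * B[c][r] for r in range(n) for c in range(n))

d = 4
obs = qudit_observables(d)
gram = {(a, b): trace_of_product(obs[a], obs[b]) for a in obs for b in obs}
```

The dictionary `gram` holds all pairwise values $\mathrm{tr}(AB)$, so the orthogonality relations can be read off directly: distinct observables give 0, while $X_{ij}$ and $Y_{ij}$ have squared norm 2 and $Z_i$ has squared norm $1 + \frac{1}{i+1}$.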



4.4 Consistency for Stoquastic Systems 

We say that a Hamiltonian is "stoquastic" if, when written in the standard basis, all
of its off-diagonal matrix elements are less than or equal to 0. (Note that the diagonal
elements can be made to be $\le 0$ by adding a multiple of the identity to the Hamiltonian;
this shifts the eigenvalues but does not change the eigenvectors.) This implies that the
ground state can be chosen to have the form $|\psi\rangle = \sum_z c_z |z\rangle$ where $|z\rangle$ are the standard
basis vectors and $c_z \ge 0$. The Stoquastic Local Hamiltonian problem is simply the Local
Hamiltonian problem with the additional promise that the local terms that make up the
Hamiltonian are stoquastic. As discussed previously, this makes the problem potentially
easier.

In this section we propose a "stoquastic" version of the Consistency problem, that 
has the same complexity as Stoquastic Local Hamiltonian (up to poly-time reductions). 
We will describe a few different versions of the problem, all of which are at least as
hard as Stoquastic Local Hamiltonian. However, one version of the problem is especially 
interesting, because we believe it is no harder than Stoquastic Local Hamiltonian. We 
provide a heuristic argument, though not a formal proof. 

First, let us say that a density matrix is "stoquastic" if, when written in the standard
basis, all of its off-diagonal matrix elements are greater than or equal to 0. (Its diagonal
elements must be $\ge 0$ since the matrix is positive semidefinite.) Note that the set of
stoquastic density matrices is convex. Now consider an obvious way of defining the
stoquastic Consistency problem:

Given local density matrices $\rho_1, \ldots, \rho_m$, does there exist a global density
matrix $\rho$ that is stoquastic and agrees with $\rho_1, \ldots, \rho_m$?

Stoquastic Local Hamiltonian reduces to this problem, using an argument like the one
in chapter 2. But this problem does not seem to be in QMA, since the verifier does not
have a way to test whether $\rho$ is indeed stoquastic.

Another way of defining the stoquastic Consistency problem is as follows:

Given local density matrices $\rho_1, \ldots, \rho_m$ which are stoquastic, does there exist
a global density matrix $\rho$ that agrees with $\rho_1, \ldots, \rho_m$?

Again, Stoquastic Local Hamiltonian reduces to this problem. Unlike our previous
attempt, this problem is in QMA. However, it is not clear whether this problem reduces
to Stoquastic Local Hamiltonian; when we apply the technique from section 4.2, we
instead get a reduction from this problem to standard Local Hamiltonian.

It turns out that the most interesting way to define the stoquastic Consistency
problem is as follows:

Given local density matrices $\rho_1, \ldots, \rho_m$, does there exist a global density
matrix $\rho$ such that, for all $i = 1, \ldots, m$, $\mathrm{tr}_{\{1,\ldots,n\}-C_i}(\rho) \ge_e \rho_i$? (Here $C_i$ is
the subset of qubits described by $\rho_i$, and $\ge_e$ denotes element-wise inequality
between two matrices written in the standard basis; we assume all matrices
are real.)

This definition is a little unusual, but we believe that it has the following interesting 
properties. First, Stoquastic Local Hamiltonian reduces to this problem. Second, this 
problem is in QMA. Finally, this problem reduces to Stoquastic Local Hamiltonian. In 
the following sections we explain these statements, though we do not present a formal 
proof. 

4.4.1 Reducing from Stoquastic Local Hamiltonian to Stoquastic Con- 
sistency 

First we show a reduction from Stoquastic Local Hamiltonian to Stoquastic Consistency.
The basic idea is as follows. We are given a local Hamiltonian $H = \sum_{i=1}^{m} H_i$, where
the $H_i$ are real and stoquastic. Without loss of generality, we can assume $H_i \le_e 0$
(we simply add a multiple of the identity to $H_i$). Now consider local density matrices
$\rho_1, \ldots, \rho_m$, where $\rho_i$ acts on the same subset of qubits as $H_i$. We want to find $\rho_1, \ldots, \rho_m$
that correspond to the ground state of $H$. Now consider the following convex program:

Find $\rho_1, \ldots, \rho_m$ that minimize $\sum_{i=1}^{m} \mathrm{tr}(H_i \rho_i)$, subject to two constraints:

(1) For all $i$, $\rho_i \succeq 0$ and $\mathrm{tr}(\rho_i) = 1$.

(2) There exists $\sigma$ s.t. $\sigma \succeq 0$, $\mathrm{tr}(\sigma) = 1$, and for all $i$, $\mathrm{tr}_{\{1,\ldots,n\}-C_i}(\sigma) \ge_e \rho_i$.
Here, $\rho_i$ is a $2^{|C_i|} \times 2^{|C_i|}$ real matrix, and $\sigma$ is a $2^n \times 2^n$ real matrix.

We claim that this convex program is equivalent to the Stoquastic Local Hamiltonian
problem. If $H$ has an eigenstate $|\psi\rangle$ with eigenvalue $\le \lambda$, then the convex program has
optimal value $\le \lambda$; to see this, set $\rho_i = \mathrm{tr}_{\{1,\ldots,n\}-C_i} |\psi\rangle\langle\psi|$. On the other hand, if all the
eigenvalues of $H$ are $\ge \lambda + \delta$, then the convex program has optimal value $\ge \lambda + \delta$; to see
this, observe that for any feasible $\rho_1, \ldots, \rho_m$, we have
$\sum_{i=1}^{m} \mathrm{tr}(H_i \rho_i) \ge \sum_{i=1}^{m} \mathrm{tr}(H_i \sigma) = \mathrm{tr}(H\sigma)$,
using constraint (2) and the fact that $H_i \le_e 0$.
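
The key inequality $\mathrm{tr}(H_i \rho_i) \ge \mathrm{tr}(H_i \sigma)$ uses only entrywise signs, which the following Python sketch illustrates (an addition, not from the original text). It draws a random real symmetric matrix with all entries $\le 0$ in the role of $H_i$, and two real symmetric matrices `rho` and `tau` with `tau` dominating `rho` entrywise; `tau` stands in for the reduced state of $\sigma$. Positivity and trace constraints are deliberately ignored, since the inequality does not need them.

```python
import random

def trace_product(A, B):
    """Return tr(AB) for square matrices given as nested lists."""
    n = len(A)
    return sum(A[r][c] * B[c][r] for r in range(n) for c in range(n))

random.seed(1)
n = 4
# H has all entries <= 0 (H <=_e 0), as assumed for each local term H_i.
H = [[0.0] * n for _ in range(n)]
for r in range(n):
    for c in range(r, n):
        H[r][c] = H[c][r] = -random.random()
# rho is an arbitrary real symmetric matrix; tau dominates it entrywise.
rho = [[0.0] * n for _ in range(n)]
tau = [[0.0] * n for _ in range(n)]
for r in range(n):
    for c in range(r, n):
        rho[r][c] = rho[c][r] = random.uniform(-1, 1)
        tau[r][c] = tau[c][r] = rho[r][c] + random.random()
lhs = trace_product(H, rho)   # plays the role of tr(H_i rho_i)
rhs = trace_product(H, tau)   # plays the role of tr(H_i sigma)
```

Since every entry of `H` is nonpositive and every entry of `tau - rho` is nonnegative, `lhs - rhs` is a sum of nonnegative terms.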

Finally, the task of solving this convex program reduces to Stoquastic Consistency. If 
we have an oracle for Stoquastic Consistency, we can use it to check whether constraint 
(2) is satisfied. This provides a membership oracle for the set K of feasible solutions, 
which then allows us to solve the convex program (theorem 3.3). The main technical
detail is to formulate the problem so that the set $K$ is full-dimensional, with inner and
outer radii that satisfy $R/r \le \mathrm{poly}(n)$. This can be done using a subset of the local
Pauli observables, where we account for the constraint that the density matrices must 
be real; we omit the details. 

4.4.2 Reducing from Stoquastic Consistency to Stoquastic Local Hamil- 
tonian 

Next we show a reduction from Stoquastic Consistency to Stoquastic Local Hamiltonian. 
The reduction uses strong duality, as in Section 4.2.

The first step is to represent $\rho_1, \ldots, \rho_m$ as the expectation values of certain
observables. However, we use a different set of observables, instead of the Pauli matrices, so
that we can deal with inequalities involving the matrix elements of $\rho_i$. For each $i$, define
the following observables acting on the subset $C_i$:

$$X^{(i)}_{st} = \tfrac{1}{2}\bigl(|s\rangle\langle t| + |t\rangle\langle s|\bigr), \qquad s, t \in \{0,1\}^{|C_i|}, \quad s \le t,$$

where $s \le t$ denotes lexicographic order. We can think of these observables as acting on
the full $n$-qubit system (we tensor them with the identity matrix). For any real $n$-qubit
state $\sigma$, the matrix elements of $\mathrm{tr}_{\{1,\ldots,n\}-C_i}(\sigma)$ are given by the expectation values of
these observables:

$$\mathrm{tr}(X^{(i)}_{st} \sigma) = \langle s | \, \mathrm{tr}_{\{1,\ldots,n\}-C_i}(\sigma) \, | t \rangle.$$

Then the conditions for a "YES" instance of Stoquastic Consistency can be written as:

$$\mathrm{tr}(X^{(i)}_{st} \sigma) \ge \langle s | \rho_i | t \rangle.$$
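
The identity relating $\mathrm{tr}(X^{(i)}_{st}\sigma)$ to the matrix elements of the reduced state can be verified directly in a small case. The Python sketch below is an addition, not from the original text; it assumes two qubits with $C_i$ equal to the first qubit, the basis ordering $|00\rangle, |01\rangle, |10\rangle, |11\rangle$, and a random real symmetric matrix in place of $\sigma$ (positivity plays no role in the identity). The helper names are hypothetical.

```python
import random

def reduced_first_qubit(sigma):
    """Partial trace over the second qubit of a 4 x 4 matrix."""
    return [[sum(sigma[2 * s + u][2 * t + u] for u in range(2)) for t in range(2)]
            for s in range(2)]

def x_observable(s, t):
    """X_st tensor I on two qubits, with X_st = (|s><t| + |t><s|) / 2 on the first qubit."""
    X = [[0.0] * 4 for _ in range(4)]
    for u in range(2):
        X[2 * s + u][2 * t + u] += 0.5
        X[2 * t + u][2 * s + u] += 0.5
    return X

def trace_product(A, B):
    """Return tr(AB) for square matrices given as nested lists."""
    n = len(A)
    return sum(A[r][c] * B[c][r] for r in range(n) for c in range(n))

random.seed(2)
# A random real symmetric matrix stands in for sigma.
sigma = [[0.0] * 4 for _ in range(4)]
for r in range(4):
    for c in range(r, 4):
        sigma[r][c] = sigma[c][r] = random.uniform(-1, 1)
reduced = reduced_first_qubit(sigma)
checks = {(s, t): (trace_product(x_observable(s, t), sigma), reduced[s][t])
          for s in range(2) for t in range(s, 2)}
```

Each entry of `checks` pairs an expectation value with the corresponding reduced matrix element; the two agree because $\sigma$ is real and symmetric, which is why the factor $\frac{1}{2}$ appears in the definition.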

We let $S$ be the set of all these observables $X^{(i)}_{st}$, for all of the subsets $C_i$, $i = 1, \ldots, m$.
We also let $d = |S|$. Note that $\|X^{(i)}_{st}\| \le 1$. These observables do not have any nice
orthogonality properties, but the reduction from Consistency to Local Hamiltonian does
not require that.

Next, we formulate a convex program, together with its dual. Define new observables

$$F^{(i)}_{st} = X^{(i)}_{st} - \langle s | \rho_i | t \rangle I,$$

which are shifted so that our goal is to satisfy the inequalities $\mathrm{tr}(F^{(i)}_{st} \sigma) \ge 0$. For
notational convenience, let us refer to these observables as $F_p$, for $p = 1, \ldots, d$. Define
$F(x)$ to be a linear combination of these observables,

$$F(x) = \sum_{p=1}^{d} x_p F_p + I, \quad \text{for } x \in \mathbb{R}^d.$$

We construct a convex program which is similar to the one in Section 4.2, except
that we restrict $x$ to lie in the domain $[0, 1]^d$ instead of $[-1, 1]^d$.

Find some $x \in [0, 1]^d$ and $s \in [1 - 2d, 1 + 2d]$ that
minimize $s$ such that $F(x) \preceq sI$.

This is the primal program; let $p^*$ denote the optimal value of the objective function $s$.
The dual program is as follows:

Find some $2^n \times 2^n$ real matrix $Z$ that
maximizes $g(Z)$ such that $Z \succeq 0$ and $\mathrm{tr}(Z) = 1$,

where the dual function $g(Z)$ is given by

$$g(Z) = \inf_{x \in [0,1]^d} \mathrm{tr}(Z F(x)) = \inf_{x \in [0,1]^d} \sum_{p=1}^{d} x_p \, \mathrm{tr}(Z F_p) + 1.$$

Let $d^*$ denote the optimal value of the objective function $g(Z)$. Strong duality holds
because the primal problem is convex and satisfies a generalized Slater condition [17]
(to see this, note that the point $(x, s) = (\frac{1}{3d}\mathbf{1}, 2)$, where $\mathbf{1}$ is the all-ones vector,
is strictly feasible). Strong duality implies that $p^* = d^*$.

Now, suppose we have a "YES" instance of Stoquastic Consistency. Then in the
dual program there exists some $Z \succeq 0$, $\mathrm{tr}(Z) = 1$, such that for all $p$, $\mathrm{tr}(Z F_p) \ge 0$. This
implies $g(Z) = 1$, hence the dual program has optimal value $d^* \ge 1$. By strong duality,
the primal program has optimal value $p^* \ge 1$.

On the other hand, suppose we have a "NO" instance of Stoquastic Consistency.
Then for all $Z$ such that $Z \succeq 0$ and $\mathrm{tr}(Z) = 1$, there is some $p$ such that $\mathrm{tr}(Z F_p) \le -\beta$,
which implies $g(Z) \le 1 - \beta$. Thus the dual program has optimal value $d^* \le 1 - \beta$. By
strong duality, the primal program has optimal value $p^* \le 1 - \beta$.

Thus it suffices to solve the primal problem. We claim that we can do this, given
an oracle for Stoquastic Local Hamiltonian. Observe that the $F_p$ are local operators,
whose off-diagonal elements are all $\ge 0$. Since $x \in [0,1]^d$ has nonnegative entries,
$-F(x)$ is a stoquastic local Hamiltonian,
and we can use the oracle to estimate its ground state energy. This is equivalent to
estimating the largest eigenvalue of $F(x)$, which allows us to test whether the constraint
$F(x) \preceq sI$ is satisfied. Thus we have a membership oracle for the set $K$ of feasible
solutions. Using a similar analysis to section 4.2, we can show that $K$ has inner and
outer radii that satisfy $R/r \le \mathrm{poly}(n)$. Then, by theorem 3.3, this allows us to solve the
primal problem.

Acknowledgements: Thanks to Frank Verstraete and Daniel Nagaj for useful discussions.

Chapter 5



Gibbs States and the Consistency 
of Local Density Matrices 

Suppose we have an n-qubit system, and we are given a collection of local density
matrices $\rho_1, \ldots, \rho_m$, where each $\rho_i$ describes some subset of the qubits. We say that
$\rho_1, \ldots, \rho_m$ are "consistent" if there exists a global state $\sigma$ (on all n qubits) whose reduced
density matrices match $\rho_1, \ldots, \rho_m$.

We prove the following result: if $\rho_1, \ldots, \rho_m$ are consistent with some state $\sigma \succ 0$, then
they are also consistent with a state $\sigma'$ of the form $\sigma' = (1/Z)\exp(M_1 + \cdots + M_m)$, where
each $M_i$ is a Hermitian matrix acting on the same qubits as $\rho_i$, and $Z$ is a normalizing
factor. (This is known as a Gibbs state.) Actually, we show a more general result, on the
consistency of a set of expectation values $\langle T_1 \rangle, \ldots, \langle T_r \rangle$, where the observables $T_1, \ldots, T_r$
need not commute. This result was previously proved by Jaynes (1957) in the context
of the maximum-entropy principle; here we provide a somewhat different proof, using
properties of the partition function.

5.1 Introduction 

Many-body systems have an intriguing property: under the right circumstances, local 
interactions can conspire to produce long-range or global effects. This behavior leads to 
phase transitions in statistical mechanics, and it also appears in combinatorial problems 
such as 3-SAT. If we consider quantum systems, the situation is more complicated, 
due to non-commuting measurements and the possibility of entanglement. This leads 
to new kinds of quantum phase transitions [73], and new examples such as the Local
Hamiltonian problem [.7J.

A basic question in all of these examples is: if we know local information about 
various parts of a system, what can we say about the system as a whole? This paper 
gives one answer to this question, for quantum systems. 

Suppose we have an n-qubit system, and we are given a collection of local density
matrices $\rho_1, \ldots, \rho_m$, where each $\rho_i$ describes a subset $C_i \subseteq \{1, \ldots, n\}$ of the qubits. We
say that $\rho_1, \ldots, \rho_m$ are "consistent" if there exists a global state $\sigma$ (on all n qubits)
whose reduced density matrices match $\rho_1, \ldots, \rho_m$; in other words, for all $i = 1, \ldots, m$,
$\mathrm{tr}_{\{1,\ldots,n\} - C_i}(\sigma) = \rho_i$.

Clearly, if $\rho_1, \ldots, \rho_m$ are consistent, then whenever two density matrices $\rho_i$ and $\rho_j$
describe overlapping subsets of qubits ($C_i \cap C_j \ne \emptyset$), they must agree on the intersection
$C_i \cap C_j$; that is, $\mathrm{tr}_{C_i - (C_i \cap C_j)}(\rho_i) = \mathrm{tr}_{C_j - (C_i \cap C_j)}(\rho_j)$. This gives a necessary condition for
consistency.

However, the above condition is not sufficient to guarantee consistency. To see this,
consider the following example: we have three qubits, and we are told that qubits 1 and 2
are in the Bell state $|\Phi^+\rangle = (|00\rangle + |11\rangle)/\sqrt{2}$, and qubits 2 and 3 are also in the same state
$|\Phi^+\rangle$. More formally, let $\rho_A = |\Phi^+\rangle\langle\Phi^+|$, $A = \{1,2\}$, and let $\rho_B = |\Phi^+\rangle\langle\Phi^+|$, $B = \{2,3\}$.
In this case, $\rho_A$ and $\rho_B$ both agree on qubit 2, since $\mathrm{tr}_1(\rho_A) = I/2 = \mathrm{tr}_3(\rho_B)$. But there
is no state $\sigma$ on all three qubits such that $\mathrm{tr}_3(\sigma) = \rho_A$ and $\mathrm{tr}_1(\sigma) = \rho_B$; one way to see
this is to apply the strong subadditivity inequality, $S(1,2,3) + S(2) \le S(1,2) + S(2,3)$.
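The three-qubit example can be checked numerically. The sketch below uses only the Python standard library, and all function names are chosen here for illustration. It confirms that the two Bell pairs agree on qubit 2, and that strong subadditivity would then force $S(1,2,3) \le S(1,2) + S(2,3) - S(2) = -1$, which no state can satisfy because entropy is nonnegative.

```python
import math

def outer(v):
    """Density matrix |v><v| of a real state vector v."""
    return [[a * b for b in v] for a in v]

def partial_trace_two_qubits(rho, keep):
    """Reduce a 4x4 two-qubit density matrix to one qubit.
    keep = 0 keeps the first qubit of the pair, keep = 1 keeps the second.
    The basis state |a c> has index 2*a + c."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for a in range(2):
        for b in range(2):
            for c in range(2):  # index of the qubit being traced out
                if keep == 0:
                    out[a][b] += rho[2 * a + c][2 * b + c]
                else:
                    out[a][b] += rho[2 * c + a][2 * c + b]
    return out

def qubit_entropy(rho):
    """Von Neumann entropy (in bits) of a real symmetric 2x2 density matrix."""
    tr = rho[0][0] + rho[1][1]
    det = rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]
    disc = math.sqrt(max(tr * tr / 4 - det, 0.0))
    eigenvalues = [tr / 2 + disc, tr / 2 - disc]
    return -sum(p * math.log2(p) for p in eigenvalues if p > 1e-12)

def purity(rho):
    """tr(rho^2); equal to 1 exactly when rho is a pure state."""
    n = len(rho)
    return sum(rho[i][j] * rho[j][i] for i in range(n) for j in range(n))

bell = [1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2)]  # (|00> + |11>)/sqrt(2)
rho_A = outer(bell)  # state of qubits {1, 2}
rho_B = outer(bell)  # state of qubits {2, 3}

qubit2_from_A = partial_trace_two_qubits(rho_A, keep=1)  # trace out qubit 1
qubit2_from_B = partial_trace_two_qubits(rho_B, keep=0)  # trace out qubit 3

# Both pairs are pure states, so S(1,2) = S(2,3) = 0.
assert abs(purity(rho_A) - 1.0) < 1e-9 and abs(purity(rho_B) - 1.0) < 1e-9
S_12 = S_23 = 0.0
S_2 = qubit_entropy(qubit2_from_A)

# Strong subadditivity would require S(1,2,3) <= S(1,2) + S(2,3) - S(2).
ssa_upper_bound = S_12 + S_23 - S_2
print(qubit2_from_A, qubit2_from_B, ssa_upper_bound)
```

A negative `ssa_upper_bound` means no three-qubit state can have both marginals, even though the marginals agree on their overlap.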

Thus the consistency of $\rho_1, \ldots, \rho_m$ would seem to be a more subtle question. We
prove the following result:

Theorem 5.1 If $\rho_1, \ldots, \rho_m$ are consistent with some state $\sigma \succ 0$, then they are also
consistent with a state $\sigma'$ of the form $\sigma' = (1/Z)\exp(M_1 + \cdots + M_m)$, where each $M_i$ is
a Hermitian matrix acting on the qubits in $C_i$, and $Z = \mathrm{tr}(\exp(M_1 + \cdots + M_m))$.

Here, $\sigma \succ 0$ means that $\sigma$ is a positive definite matrix. The state $\sigma'$ is known as a Gibbs
state.

Essentially, this result says that a Gibbs state $\sigma'$ can simulate an arbitrary state
$\sigma \succ 0$, with respect to an observer who can only access subsets $C_1, \ldots, C_m$ of the
qubits. For example, consider a physical system with local interactions, described by a
Hamiltonian $H$. It is easy to see that the ground state of $H$ can be approximated by
$\eta = (1/Z)\exp(-\beta H)$, for $\beta$ large; and since $H$ is a sum of local terms, $\eta$ is a Gibbs state.
Our result extends this simple observation to a much more general setting.

Actually, we prove the following more general result: Consider a finite quantum system,
and let $T_1, \ldots, T_r$ be observables (Hermitian matrices). Without loss of generality,
assume that the collection of matrices $I, T_1, \ldots, T_r$ is linearly independent (over $\mathbb{R}$). We
say that a state $\rho$ has expectation values $t_1, \ldots, t_r$ if $\mathrm{tr}(T_i \rho) = t_i$ for all $i = 1, \ldots, r$.

Theorem 5.2 If there exists some state $\rho \succ 0$ which has expectation values $t_1, \ldots, t_r$,
then there exists a state $\rho'$ which has the same expectation values $t_1, \ldots, t_r$, and is of the
form $\rho' = (1/Z)\exp(\theta_1 T_1 + \cdots + \theta_r T_r)$, where $\theta_1, \ldots, \theta_r \in \mathbb{R}$.

This statement holds even when the observables $T_1, \ldots, T_r$ do not commute.

This result was previously proved by Jaynes, as part of the maximum entropy principle
in statistical mechanics [47, 48]. Jaynes showed that the Gibbs state $\rho'$ is the state
which maximizes the entropy $S(\rho) = -\mathrm{tr}(\rho \log \rho)$ subject to the constraints $\langle T_i \rangle = t_i$;
implicitly, he also showed that the Gibbs state $\rho'$ is always feasible, in the sense that it
can produce the same expectation values $\langle T_i \rangle$ as an arbitrary state $\rho \succ 0$.

However, Jaynes' motivation was somewhat different from ours. Jaynes was inter- 
ested in statistical mechanics, which deals with large systems with many degrees of 
freedom and only a few constraints. Feasibility is not usually a concern in such cases, 
while the maximum-entropy property is crucial in making plausible inferences about the
"true" state of the system.

In this paper, we focus on finite quantum systems, with many non-commuting constraints;
we are interested in the relationship between local constraints and the global
state of the system. For us, feasibility of the Gibbs state is important, since it is possible 
for the system to become overdetermined. Statistical inference is less important, because 
the systems we study are small enough that their state can be completely determined 
(at least in principle). Rather than viewing this as an inference problem, we can speak 
directly about what states are allowed under a given set of constraints. 

Finally, we prove our result using a technique which is different from Jaynes'. Jaynes
used the Lagrange dual of the entropy-maximization problem, while we use some analytic
properties of the partition function. Our analysis bears some resemblance to classical
results on exponential families in statistics [25], although the technical details are quite
different. Our proof also contains some geometric intuition which may be of interest.

5.2 Proofs of our results 

First, we will review some useful facts about the partition function for a Gibbs state.
Then we will prove theorem 5.2, and obtain theorem 5.1 as a special case.

5.2.1 The partition function 

Recall the situation described in theorem 5.2: we have a finite quantum system, and
observables $T_1, \ldots, T_r$, such that $I, T_1, \ldots, T_r$ are linearly independent (over $\mathbb{R}$). We are
interested in states of the form

$$\rho(\theta) = \exp(\theta_1 T_1 + \cdots + \theta_r T_r)/Z(\theta), \quad \theta \in \mathbb{R}^r,$$

where $Z(\theta) = \mathrm{tr}(\exp(\theta_1 T_1 + \cdots + \theta_r T_r))$. $Z(\theta)$ is called the partition function, and we
also define the log partition function $\psi(\theta) = \log Z(\theta)$.

Note that, in the above definition, we can translate each observable $T_i$ by a multiple of
the identity, without changing the state $\rho(\theta)$. More precisely, if we define new observables
$P_i = T_i + \lambda_i I$, with $\lambda_i \in \mathbb{R}$, we have that:

$$\frac{\exp(\theta_1 P_1 + \cdots + \theta_r P_r)}{\mathrm{tr}(\exp(\theta_1 P_1 + \cdots + \theta_r P_r))} = \frac{\exp(\theta_1 T_1 + \cdots + \theta_r T_r)}{\mathrm{tr}(\exp(\theta_1 T_1 + \cdots + \theta_r T_r))}.$$

Using subscripts $T$ and $P$ to denote the two sets of observables, we arrive at the same
state, $\rho_P(\theta) = \rho_T(\theta)$, although the partition functions are different, $Z_P(\theta) \ne Z_T(\theta)$.
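The translation identity follows in one line, because the identity matrix commutes with every observable. A short derivation, writing $c = \sum_i \theta_i \lambda_i$ (a symbol introduced here only for convenience):

```latex
\begin{align*}
\exp\Bigl(\sum_{i=1}^{r} \theta_i P_i\Bigr)
  &= \exp\Bigl(cI + \sum_{i=1}^{r} \theta_i T_i\Bigr)
   = e^{c}\,\exp\Bigl(\sum_{i=1}^{r} \theta_i T_i\Bigr), \\
Z_P(\theta) &= e^{c}\, Z_T(\theta), \qquad
\rho_P(\theta) = \frac{e^{c}\exp\bigl(\sum_i \theta_i T_i\bigr)}{e^{c}\, Z_T(\theta)} = \rho_T(\theta), \\
\log Z_P(\theta) &= \log Z_T(\theta) + \sum_{i=1}^{r} \theta_i \lambda_i .
\end{align*}
```

So translating the observables changes the log partition function only by a linear term. Since the state is unchanged, each expectation value shifts by exactly $\lambda_i$, and choosing $\lambda_i = -t_i$ moves a target expectation value $t_i$ to zero.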

The log partition function $\psi$ has some nice analytic properties: it is convex, and its
derivatives encode the expectation values of the observables $T_i$. We briefly sketch these
results, which can be found in quantum statistical mechanics [48], as well as quantum
information geometry [44].

Proposition 5.3 $\psi$ is convex on $\mathbb{R}^r$.

Proof sketch: This follows from some facts in matrix analysis [18]. First, the Golden-Thompson
inequality: If $A$ and $B$ are Hermitian matrices, then

$$\mathrm{tr}(\exp(A + B)) \le \mathrm{tr}(\exp(A)\exp(B)).$$

Next, a matrix version of Hölder's inequality: For any matrix $A$, define the Frobenius
or Hilbert-Schmidt norm to be $\|A\|_2 = (\mathrm{tr}(A^\dagger A))^{1/2}$. Also, let $|A|$ denote the unique
positive semidefinite square root of $A^\dagger A$. Then we have that, for all square matrices $A$
and $B$,

$$\|AB\|_2 \le \bigl\||A|^p\bigr\|_2^{1/p}\, \bigl\||B|^q\bigr\|_2^{1/q},$$

for $1/p + 1/q = 1$, $p > 1$. □

Proposition 5.4 $\psi$ is differentiable on $\mathbb{R}^r$, and

$$\frac{\partial \psi}{\partial \theta_i} = \mathrm{tr}(T_i\, \rho(\theta)) = \langle T_i \rangle.$$

Proof sketch: Use "parameter differentiation" [86]: If $H$ is a Hermitian matrix which
depends on a parameter $\lambda$, and $dH/d\lambda$ and $d^2H/d\lambda^2$ exist and are continuous, then
$d(\exp(H))/d\lambda$ exists and is equal to

$$\frac{d}{d\lambda}\exp(H) = \int_0^1 \exp((1-u)H)\,\frac{dH}{d\lambda}\,\exp(uH)\,du. \qquad \Box$$
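Propositions 5.3 and 5.4 can be sanity-checked numerically on a small case. The sketch below is illustrative only: the choice of a single qubit with the non-commuting observables $\sigma_z$ and $\sigma_x$, the truncated Taylor series for the matrix exponential, and all function names are assumptions made here. It compares a finite-difference gradient of the log partition function with the Gibbs-state expectation values, and tests convexity at one midpoint.

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def expm(H, terms=40):
    """Matrix exponential by a truncated Taylor series (adequate for small matrices of modest norm)."""
    n = len(H)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, H)]  # now H^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

SIGMA_Z = [[1.0, 0.0], [0.0, -1.0]]
SIGMA_X = [[0.0, 1.0], [1.0, 0.0]]
OBSERVABLES = [SIGMA_Z, SIGMA_X]  # T_1 and T_2; they do not commute

def exponent(theta):
    """theta_1 T_1 + theta_2 T_2."""
    return [[sum(t * T[i][j] for t, T in zip(theta, OBSERVABLES)) for j in range(2)] for i in range(2)]

def log_partition(theta):
    return math.log(trace(expm(exponent(theta))))

def gibbs_expectation(theta, T):
    """tr(T rho(theta)) for the Gibbs state rho(theta)."""
    E = expm(exponent(theta))
    return trace(matmul(T, E)) / trace(E)

def numerical_gradient(theta, h=1e-5):
    """Central finite differences of the log partition function."""
    grad = []
    for i in range(len(theta)):
        up, down = list(theta), list(theta)
        up[i] += h
        down[i] -= h
        grad.append((log_partition(up) - log_partition(down)) / (2 * h))
    return grad

theta = [0.7, -0.4]
grad = numerical_gradient(theta)
expect = [gibbs_expectation(theta, T) for T in OBSERVABLES]

# Midpoint convexity test at two arbitrary points.
a, b = [0.7, -0.4], [-1.2, 0.9]
midpoint = [(u + v) / 2 for u, v in zip(a, b)]
convexity_gap = (log_partition(a) + log_partition(b)) / 2 - log_partition(midpoint)
print(grad, expect, convexity_gap)
```

The two lists printed first should agree to within the finite-difference error, and `convexity_gap` should be nonnegative. This is a spot check at a few points, not a proof.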



5.2.2 Proof of theorem 5.2

Proof: We are given expectation values $t_1, \ldots, t_r$, and we want to find a state

$$\rho'(\theta) = \exp(\theta_1 T_1 + \cdots + \theta_r T_r)/Z'(\theta)$$

that has these expectation values. (Here, $Z'(\theta)$ is the partition function, and $\psi'(\theta) =
\log Z'(\theta)$ is the log partition function.) By translating the observables $T_i$, we can assume
that $t_i = 0$, for all $i = 1, \ldots, r$. We can now restate the problem in terms of the log
partition function: we are looking for some $\theta$ such that $\nabla\psi'(\theta) = 0$.

We know there exists a state $\rho \succ 0$ which has the desired expectation values $t_1, \ldots, t_r$.
Now choose some observables $U_1, \ldots, U_s$, such that the set $\{I, T_1, \ldots, T_r, U_1, \ldots, U_s\}$
is complete and linearly independent (in other words, any $2^n$-dimensional Hermitian
matrix can be written uniquely as a real linear combination of the matrices in this set).
Let $u_1, \ldots, u_s$ be the expectation values of $\rho$ for the observables $U_1, \ldots, U_s$; that is,
$u_i = \mathrm{tr}(U_i \rho)$. By translating the $U_i$, we can assume that $u_i = 0$, for all $i = 1, \ldots, s$.

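We will consider states of the form

$$\rho(\theta, \phi) = \exp(\theta_1 T_1 + \cdots + \theta_r T_r + \phi_1 U_1 + \cdots + \phi_s U_s)/Z(\theta, \phi).$$

(Here, $Z(\theta,\phi)$ is the partition function, and $\psi(\theta,\phi) = \log Z(\theta,\phi)$ is the log partition
function.) Completeness of the $T_i$ and the $U_i$ implies that we can write $\rho$ in the form
$\rho = \rho(\theta,\phi)$ for some $(\theta,\phi) \in \mathbb{R}^{r+s}$: since $\rho \succ 0$, $\log \rho$ is a Hermitian matrix, and its
component along the identity is absorbed into the normalization. By proposition 5.4, this
implies that $\nabla\psi(\theta,\phi) = 0$ for some $(\theta,\phi) \in \mathbb{R}^{r+s}$.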
Furthermore, we claim that there is a unique point $(\theta,\phi)$ such that $\rho(\theta,\phi)$ has the
expectation values $t_i$ and $u_i$. This is because the expectation values $t_i$ and $u_i$ uniquely
determine the state $\rho$, and setting $\rho = \rho(\theta,\phi)$ uniquely determines the values of $\theta$ and
$\phi$. This in turn follows from the completeness and linear independence of the $T_i$ and the
$U_i$. So we conclude that $\nabla\psi(\theta,\phi) = 0$ at exactly one point $(\theta,\phi)$.

To complete the proof, we will carry out the following plan: we will show that
$\psi(\theta,\phi) \to \infty$ as $\|(\theta,\phi)\| \to \infty$, where $\|(\theta,\phi)\|$ denotes the norm of the vector $(\theta,\phi)$. This
implies that $\psi'(\theta) \to \infty$ as $\|\theta\| \to \infty$; and hence $\nabla\psi'(\theta) = 0$ for some $\theta$. (See figure
5.1 for a simple example that shows the geometric intuition for the proof.)

Let $(\theta_0, \phi_0)$ be the unique point where $\nabla\psi$ vanishes. We claim that $(\theta_0, \phi_0)$ is the
unique global minimum of $\psi$. [Since $\psi$ is convex (proposition 5.3) and $\nabla\psi$ vanishes at
$(\theta_0, \phi_0)$, it follows that $\psi$ is bounded below, and $(\theta_0, \phi_0)$ is a global minimum. Also, $\psi$ is
differentiable everywhere on the domain $\mathbb{R}^{r+s}$, which has no boundaries (proposition 5.4);
so any extremum $(\theta,\phi)$ must satisfy $\nabla\psi(\theta,\phi) = 0$. But this happens only at $(\theta_0, \phi_0)$,
and so $(\theta_0, \phi_0)$ is the unique global minimum.]

Let $S$ be the set of all unit vectors in $\mathbb{R}^{r+s}$. Define the function $f(\nu, z) = \psi((\theta_0, \phi_0) +
z\nu)$, for $\nu \in S$, and $z \in \mathbb{R}$. Say we fix $z = 1$. We claim that there exists some $b > 0$
such that, for all $\nu$, $f(\nu, 1) \ge \psi(\theta_0, \phi_0) + b$. [Since $(\theta_0, \phi_0)$ is the unique global minimum,
we have that $f(\nu, 1) > \psi(\theta_0, \phi_0)$, for all $\nu$. Moreover, $f(\nu, 1)$ is a continuous function of
$\nu$, and $S$ is a compact set, hence its image $f(S, 1)$ is compact and $f(\cdot, 1)$ attains its
minimum on $S$. Hence $f(\nu, 1)$ must be bounded away from $\psi(\theta_0, \phi_0)$, for all $\nu$.]

Next we claim that, for all $\nu$ and for all $z \ge 1$, $(\partial f/\partial z)(\nu, z) \ge b$. [Fix any $\nu$.
$f(\nu, z)$ is a differentiable function of $z$, so by the mean value theorem, there exists some
$z_0 \in (0, 1)$ such that $(\partial f/\partial z)(\nu, z_0) = f(\nu, 1) - f(\nu, 0) \ge b$. In addition, since $\psi$ is convex,
$(\partial f/\partial z)(\nu, z)$ is nondecreasing in $z$. This proves the claim.]

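Now, say we are given some $(\theta,\phi)$, and assume that $\|(\theta,\phi) - (\theta_0,\phi_0)\| \ge 1$. We can
write $(\theta,\phi)$ in the form

$$(\theta,\phi) = (\theta_0,\phi_0) + \|(\theta,\phi) - (\theta_0,\phi_0)\|\,\nu,$$

for some unit vector $\nu \in S$. Then we have:

$$\begin{aligned}
\psi(\theta,\phi) &= f(\nu, \|(\theta,\phi) - (\theta_0,\phi_0)\|) \\
&= f(\nu, 1) + \int_1^{\|(\theta,\phi) - (\theta_0,\phi_0)\|} (\partial f/\partial z)(\nu, z)\,dz \\
&\ge \psi(\theta_0,\phi_0) + b + b\,(\|(\theta,\phi) - (\theta_0,\phi_0)\| - 1) \\
&= \psi(\theta_0,\phi_0) + b\,\|(\theta,\phi) - (\theta_0,\phi_0)\|.
\end{aligned}$$

From this, we conclude that $\psi(\theta,\phi) \to \infty$ as $\|(\theta,\phi)\| \to \infty$.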

Notice that the log partition functions for $\rho'(\theta)$ and $\rho(\theta,\phi)$ are related:

$$\psi'(\theta) = \psi(\theta, 0).$$

Hence, $\psi'(\theta) \to \infty$ as $\|\theta\| \to \infty$.

We will use the following fact: if $f : \mathbb{R}^n \to \mathbb{R}$ is continuous, and $f(x) \to \infty$ as
$\|x\| \to \infty$, then $f$ is bounded below, and $f$ attains its minimum at some point $x^* \in \mathbb{R}^n$.
[To see this, let $S = \{x \in \mathbb{R}^n \mid f(x) \le a\}$, choosing $a$ large enough that $S \ne \emptyset$. Note
that $S$ is bounded; otherwise, there would exist a sequence $\{x_i\}$ such that $\|x_i\| \to \infty$
and $f(x_i) \le a$, a contradiction. Also, note that $S$ is closed; this is because the interval
$(-\infty, a]$ is closed, and $f$ is continuous. So we have that $S$ is compact. This implies that
$f(S)$ is compact. Hence $f(S)$ is closed and bounded; also note that $f(S) \ne \emptyset$. This
implies that $f$ is bounded below, and attains its minimum.]

From this, we conclude that $\psi'$ attains its minimum at some point $\theta^* \in \mathbb{R}^r$. $\mathbb{R}^r$ has
no boundaries, and $\psi'$ is differentiable everywhere on $\mathbb{R}^r$, so it follows that $\nabla\psi'(\theta^*) = 0$.
This completes the proof. □
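The single-qubit case of figure 5.1 makes the argument concrete. The sketch below uses illustrative function names, and plain bisection is a choice made here, not the method of the proof. It takes the one translated observable $T = \sigma_z + 0.6\,I$, for which $\psi'(\theta) = \log(e^{1.6\theta} + e^{-0.4\theta})$ and $d\psi'/d\theta = \tanh\theta + 0.6$, and locates the stationary point. Analytically this is $\theta^* = -\ln 2$, giving the Gibbs state $\mathrm{diag}(0.2, 0.8)$ with $\langle\sigma_z\rangle = -0.6$.

```python
import math

def psi_prime(theta):
    """Log partition function for the single observable T = sigma_z + 0.6*I.
    exp(theta*T) is diagonal with entries exp(1.6*theta) and exp(-0.4*theta)."""
    return math.log(math.exp(1.6 * theta) + math.exp(-0.4 * theta))

def d_psi_prime(theta):
    """Derivative of psi_prime; by proposition 5.4 it equals <T> = <sigma_z> + 0.6."""
    return math.tanh(theta) + 0.6

def find_stationary_point(lo=-5.0, hi=5.0, iterations=80):
    """Bisection on the derivative, which is increasing because psi_prime is convex."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if d_psi_prime(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

theta_star = find_stationary_point()

# Gibbs state exp(theta* sigma_z)/Z; the shift by 0.6*I cancels in the normalization.
weights = [math.exp(theta_star), math.exp(-theta_star)]
Z = sum(weights)
gibbs_diagonal = [w / Z for w in weights]  # diagonal of rho' in the sigma_z basis
sigma_z_expectation = gibbs_diagonal[0] - gibbs_diagonal[1]

# psi_prime grows in both directions away from theta_star, as the proof requires.
grows_both_ways = psi_prime(-10.0) > psi_prime(theta_star) < psi_prime(10.0)
print(theta_star, gibbs_diagonal, sigma_z_expectation, grows_both_ways)
```

Note that the growth of $\psi'$ at infinity relies on the target value $-0.6$ lying strictly inside $[-1, 1]$, which is where the hypothesis $\rho \succ 0$ enters; for a target of $\pm 1$ the derivative never vanishes.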

5.2.3 Proof of theorem 5.1

Proof: We will obtain theorem 5.1 as a special case of theorem 5.2. The basic idea
is that specifying the local density matrices $\rho_1, \ldots, \rho_m$ is equivalent to specifying the
expectation values of all Pauli matrices on the subsets $C_1, \ldots, C_m$.

Let $X$, $Y$ and $Z$ denote the Pauli matrices for a single qubit, and define $\mathcal{P} =
\{I, X, Y, Z\}$. We can construct n-qubit Pauli matrices by taking tensor products $P =
P_1 \otimes \cdots \otimes P_n \in \mathcal{P}^{\otimes n}$. Any $2^n$-dimensional Hermitian matrix can be written as a real
linear combination of n-qubit Pauli matrices. Furthermore, the n-qubit Pauli matrices
are orthogonal with respect to the Hilbert-Schmidt inner product: $\mathrm{tr}(P^\dagger Q) = 2^n$ if
$P = Q$, and $0$ otherwise.

We make the following claim: Let $\sigma$ be a density matrix on n qubits, and let $\rho$
be a density matrix on a subset of the qubits $C \subseteq \{1, \ldots, n\}$, with $|C| = k$. We
claim that $\mathrm{tr}_{\{1,\ldots,n\} - C}(\sigma) = \rho$ if and only if, for all Pauli matrices $P$ on the subset $C$,
$\mathrm{tr}((P \otimes I)\sigma) = \mathrm{tr}(P\rho)$. (Notation: we write n-qubit Pauli matrices in the form $P \otimes Q$,
where $P$ acts on the subset $C$, and $Q$ acts on the rest of the qubits.)

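The ($\Rightarrow$) direction is obvious, but we need to show ($\Leftarrow$). We write $\sigma$ and $\rho$ as linear
combinations of Pauli matrices, with real coefficients $\beta_{P \otimes Q}$ and $\alpha_P$:

$$\sigma = \sum_{P \in \mathcal{P}^{\otimes k},\; Q \in \mathcal{P}^{\otimes (n-k)}} \beta_{P \otimes Q}\, P \otimes Q, \qquad \rho = \sum_{P \in \mathcal{P}^{\otimes k}} \alpha_P\, P.$$

We know that, for all Pauli matrices $P$ on the subset $C$, $\mathrm{tr}((P \otimes I)\sigma) = 2^n \beta_{P \otimes I} =
\mathrm{tr}(P\rho) = 2^k \alpha_P$. But this implies:

$$\mathrm{tr}_{\{1,\ldots,n\} - C}(\sigma) = \sum_{P \in \mathcal{P}^{\otimes k}} 2^{n-k} \beta_{P \otimes I}\, P = \sum_{P \in \mathcal{P}^{\otimes k}} \alpha_P\, P = \rho,$$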

which proves the claim. 
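The claim can be illustrated for $n = 2$ and $C = \{1\}$ (so $k = 1$). The sketch below uses only the Python standard library; the particular state `sigma` and all helper names are made up here for illustration. It computes the reduced state of qubit 1 in two ways: by a direct partial trace, and by reassembling it from the expectation values $\mathrm{tr}((P \otimes I)\sigma)$ via $\rho = \sum_P \alpha_P P$ with $\alpha_P = \mathrm{tr}((P \otimes I)\sigma)/2^k$.

```python
import math

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
PAULIS = [I2, X, Y, Z]

def kron(A, B):
    """Tensor product of two square matrices; row index is i*len(B) + k."""
    return [[A[i][j] * B[k][l] for j in range(len(A)) for l in range(len(B))]
            for i in range(len(A)) for k in range(len(B))]

def trace_of_product(A, B):
    """tr(AB), computed without forming the product."""
    n = len(A)
    return sum(A[i][j] * B[j][i] for i in range(n) for j in range(n))

# A hypothetical two-qubit state: a pure state mixed with the maximally mixed state.
amplitudes = [complex(1), 1j, complex(0), complex(1)]
norm = math.sqrt(sum(abs(a) ** 2 for a in amplitudes))
amplitudes = [a / norm for a in amplitudes]
pure = [[a * b.conjugate() for b in amplitudes] for a in amplitudes]
sigma = [[0.7 * pure[i][j] + (0.3 / 4 if i == j else 0) for j in range(4)] for i in range(4)]

# Direct partial trace over qubit 2 (the basis state |a c> has index 2*a + c).
rho_direct = [[sum(sigma[2 * a + c][2 * b + c] for c in range(2)) for b in range(2)]
              for a in range(2)]

# Reconstruction from local expectation values only.
rho_from_paulis = [[0, 0], [0, 0]]
for P in PAULIS:
    expectation = trace_of_product(kron(P, I2), sigma)  # tr((P (x) I) sigma)
    for a in range(2):
        for b in range(2):
            rho_from_paulis[a][b] += expectation * P[a][b] / 2  # alpha_P = expectation / 2^k

max_difference = max(abs(rho_direct[a][b] - rho_from_paulis[a][b])
                     for a in range(2) for b in range(2))
print(max_difference)
```

The two reduced states should coincide up to rounding, which is the content of the ($\Leftarrow$) direction in this small case.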

Thus, theorem 5.1 is a special case of theorem 5.2, where the observables $T_1, \ldots, T_r$
consist of all the Pauli matrices on the subsets $C_1, \ldots, C_m$. □

Acknowledgements: I am grateful to Dorit Aharonov, Chris Fuchs and David Meyer for 
helpful discussions about this work. Funded by an ARO/NSA Quantum Computing 
Graduate Research Fellowship.

Figure 5.1: A single-qubit example. We want to find a Gibbs state $\rho'$ that satisfies
$\langle\sigma_z\rangle = -0.6$; we have one observable $T = \sigma_z + 0.6$. We know there exists some state
$\rho \succ 0$ that satisfies $\langle\sigma_z\rangle = -0.6$; in this case, $\rho$ also satisfies $\langle\sigma_x\rangle = -0.3$, and we let
$U = \sigma_x + 0.3$ play the role of the "extra" observables. As the graph shows, $\nabla\psi(\theta, \phi)$
vanishes at exactly one point; $\psi'(\theta) = \psi(\theta, 0)$; and $\nabla\psi'(\theta)$ vanishes for some $\theta$.

Chapter 6

Conclusions 



In this dissertation we have studied the complexity of the Consistency and N-representability
problems. We showed that these problems are QMA-complete, using reductions
based on convex optimization with a membership oracle. In addition, we showed that 
certain special cases of Consistency and Local Hamiltonian have the same complexity 
(even though they are not known to be QMA-hard). 

A number of interesting open problems remain. Are there better reductions from 
convex optimization to membership? (In particular, are there reductions that have a less 
stringent precision requirement for the membership oracle?) Can one give a mapping 
reduction from Local Hamiltonian to Consistency, rather than an oracle reduction? Can 
one show that approximately solving Local Hamiltonian is QMA-hard? (This would be 
a quantum analogue of the celebrated PCP theorem [79l \T0\ [34j.) 

We are also starting to understand the complexity of special classes of quantum 
systems. In chapter 3 we remarked that translationally-invariant systems seem to be an 
easy special case. However, a recent result [50] shows that this is no longer true if one
allows interactions involving log(n) particles, or particles that have n states; in these
cases, Local Hamiltonian is once more QMA-complete.

On the positive side, recent work suggests that there is a polynomial-time approximation
scheme (PTAS) for Local Hamiltonian on planar graphs [12], even though solving
the problem exactly is QMA-hard. Also, it seems likely that one can get a PTAS for
Local Hamiltonian on a 1-D chain, by reducing to the Consistency problem and applying
p-positivity conditions [61].